Upgrade from 2.31 to 2.32 causing disk space issues

Hi there,

I’m currently trying to upgrade from DHIS2 2.31.5 (CI build) to DHIS2 2.32.1 (CI build) and can see that the Flyway migration seems to be taking an excessive amount of time and quite a lot of disk space.

The original database was around 20GB, it runs for over an hour, slowly taking up more and more space, until it finally uses up the 100GB of disk space and crashes.

I’ve set max_locks_per_transaction to 1024 and the PostGIS extension is installed. The DB is running Postgresql 11 (on an AWS RDS instance with 4GB RAM). From the postgres logs, I see a whole bunch of buffer write outs every few minutes when it gets to 1.2GB.

Any thoughts on what might be happening?

Cheers,

  • Jasper
1 Like

Hi,

Probably you have a lot of data in your datavalue table. When you run the upgrade, it seems some indexes are dropped and recreated in the datavalue table, which is a process that requires a constant increase in space used until it completes. Could be another table.

In general, the solution is to allow it to continue the process, as long as you ensure that the space is adequate. so unfortunately, once it stops on the way there is no clear way to clear that space that was used already - so you quickly run out with every try afterwards. Clear your space used first, however way is easier for you and try again - wait for a long as it takes.

You might need up to 1TB, and at the end it might use up to 30-50% of the space. It will also take some time, so you should be patient. Mignt be other errors too, so keep watching the logs for errors.

Anyhow, some thoughts.

Regards,

Ranga

2 Likes

Hi Jasper

I am not sure if this will help with your overall problem, and it might well be that you have bigger issues. But considering just improving performance it is worthwhile considering that the application is in quite a different mode to usual when it is doing an upgrade. Looks like you might have lots of creation of new indexes etc. It might be worthwhile to configure your database a bit like it is more optimised for restoration. See for example http://www.databasesoup.com/2014/09/settings-for-fast-pgrestore.html for some tips.

Assuming you have made backups before attempting upgrade, you can safely do away with some of the normal protections like fsync and just try and get the job done as quick as possible.

That is (yet another) good reason to include all your optimisations in a single file and include it at the end of postgresql.conf rather than pepper changes through the file. Then you can have different configs which you optimize for different “profiles”.

Cheers
Bob

1 Like

Thanks for the responses @bobj and @Ranga

I looked at the logs for both upgrade attempts and it seemed it was always failing on: “alter table datavalueaudit alter column attributeoptioncomboid type bigint”. The datavalueaudit table had 63 million rows in it whereas the datavalue table itself had 5 million, not sure if that sounds about right.

As this is a test database I was happy to simply truncate the datavalueaudit table. After doing this the upgrade completed in about 7 mins (instead of > 1hr) and only took an additional 7 GB (instead of > 100GB). I don’t think this would have been expected, there definitely feels like there was something wrong with the datavalueaudit table or the migration itself.

I was more so interested in seeing what potential issues there might be when doing this upgrade in any production instances.

Anyway, hope this might be helpful for others to consider when doing the same upgrade. Although it’s quite possible there was something corrupt with our datavalueaudit table.

Cheers,

  • Jasper
2 Likes

Just adding in that I had a similar problem while testing the 2.31 to 2.32 upgrade on a database with a very large datavalueaudit table. I increased the disk space 5x and still ran out of space. I eventually came to the same conclusion as Jasper and cleared the datavalueaudit table, and then it proceeded normally.

I suspect that this will be a problem for a lot of people with large databases as they start moving from 2.31 to 2.32 and may merit some guidance in the 2.32 upgrade notes.

1 Like