Major speed issue with pg_restore

Edward_Robinson · 14 July 2017 01:54

I’ve setup a new Ubuntu 16.04 box with PostgreSQL 9.5.5 and I’m restoring a fairly large DHIS2 backup but having speed issues. It’s a full pg_dump in custom format and about 650Gb compressed (a plain text dump produces a 7Gb file). I made sure inserts were turned off, so that’s not the issue, but so far it’s been running for 33 hours – CPU at 100% - with no end in sight. This is a backup file that took 30 minutes to generate.

If it is running synchronously, I calculated that it’s on around 20 million DB rows of around 170 million after 33 hours! Surely that’s not normal.

I’ve tweaked PostgreSQL with the following settings:

maintenance_work_mem=2GB

max_wal_size = 1Gb

checkpoint_timeout = 3600

checkpoint_completion_target = 0.9

Anyway, if anyone has insight or has had a similar experience, or suggestions, please let me know!

I’m testing it on another (Windows) instance to see if there’s something amiss.

Cheers,

Ed

Knut_Staring · 14 July 2017 01:58

Perhaps useful to also send this question to a postgres forum?

···

On Jul 13, 2017 9:55 PM, “Edward Robinson” erobinson@projectbalance.com wrote:

I’ve setup a new Ubuntu 16.04 box with PostgreSQL 9.5.5 and I’m restoring a fairly large DHIS2 backup but having speed issues. It’s a full pg_dump in custom format and about 650Gb compressed (a plain text dump produces a 7Gb file). I made sure inserts were turned off, so that’s not the issue, but so far it’s been running for 33 hours – CPU at 100% - with no end in sight. This is a backup file that took 30 minutes to generate.

If it is running synchronously, I calculated that it’s on around 20 million DB rows of around 170 million after 33 hours! Surely that’s not normal.

I’ve tweaked PostgreSQL with the following settings:

maintenance_work_mem=2GB

max_wal_size = 1Gb

checkpoint_timeout = 3600

checkpoint_completion_target = 0.9

Anyway, if anyone has insight or has had a similar experience, or suggestions, please let me know!

I’m testing it on another (Windows) instance to see if there’s something amiss.

Cheers,

Ed

Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

Edward_Robinson · 14 July 2017 02:07

Agree, I’m sure it’s a PostgreSQL / Ubuntu 16.04 issue. I’m posting in as many locations as possible. Seems I’m not alone

https://serverfault.com/questions/790723/postgres-9-5-restoring-dumps-is-very-very-slow

https://www.questarter.com/q/restoring-postgres-dump-is-very-very-slow-using-on-ubuntu-16-04-7_171095.html

Ed

···

Perhaps useful to also send this question to a postgres forum?

On Jul 13, 2017 9:55 PM, “Edward Robinson” erobinson@projectbalance.com wrote:

I’ve setup a new Ubuntu 16.04 box with PostgreSQL 9.5.5 and I’m restoring a fairly large DHIS2 backup but having speed issues. It’s a full pg_dump in custom format and about 650Gb compressed (a plain text dump produces a 7Gb file). I made sure inserts were turned off, so that’s not the issue, but so far it’s been running for 33 hours – CPU at 100% - with no end in sight. This is a backup file that took 30 minutes to generate.

If it is running synchronously, I calculated that it’s on around 20 million DB rows of around 170 million after 33 hours! Surely that’s not normal.

I’ve tweaked PostgreSQL with the following settings:

maintenance_work_mem=2GB

max_wal_size = 1Gb

checkpoint_timeout = 3600

checkpoint_completion_target = 0.9

Anyway, if anyone has insight or has had a similar experience, or suggestions, please let me know!

I’m testing it on another (Windows) instance to see if there’s something amiss.

Cheers,

Ed

Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

bobj · 14 July 2017 12:02

Hi Edward

Can you tell us a bit more about the machine. In particular RAM size
and disk type/speed and filesystem type.

Given that you are probably not going to be running anything else on
this machine while you are restoring I think you might be able to
tweak a bit more aggressively just for this operation.

You don't mention shared_buffers above? That is a dramatic control
lever. Is that just something you left off your mail?

Bob

···

On 14 July 2017 at 02:54, Edward Robinson <erobinson@projectbalance.com> wrote:

I’ve setup a new Ubuntu 16.04 box with PostgreSQL 9.5.5 and I’m restoring a
fairly large DHIS2 backup but having speed issues. It’s a full pg_dump in
custom format and about 650Gb compressed (a plain text dump produces a 7Gb
file). I made sure inserts were turned off, so that’s not the issue, but so
far it’s been running for 33 hours – CPU at 100% - with no end in sight.
This is a backup file that took 30 minutes to generate.

If it is running synchronously, I calculated that it’s on around 20 million
DB rows of around 170 million after 33 hours! Surely that’s not normal.

I’ve tweaked PostgreSQL with the following settings:

maintenance_work_mem=2GB

max_wal_size = 1Gb

checkpoint_timeout = 3600

checkpoint_completion_target = 0.9

Anyway, if anyone has insight or has had a similar experience, or
suggestions, please let me know!

I’m testing it on another (Windows) instance to see if there’s something
amiss.

Cheers,

Ed

_______________________________________________
Mailing list: https://launchpad.net/~dhis2-users
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-users
More help : https://help.launchpad.net/ListHelp

bobj · 14 July 2017 12:07

I am also trying to understand how a plain sql dump is only 7G while
the custom compressed format is 650G. But given that is true, why
aren't you using the plain text dump?

···

On 14 July 2017 at 13:02, Bob Jolliffe <bobjolliffe@gmail.com> wrote:

Hi Edward

Can you tell us a bit more about the machine. In particular RAM size
and disk type/speed and filesystem type.

Given that you are probably not going to be running anything else on
this machine while you are restoring I think you might be able to
tweak a bit more aggressively just for this operation.

You don't mention shared_buffers above? That is a dramatic control
lever. Is that just something you left off your mail?

Bob

On 14 July 2017 at 02:54, Edward Robinson <erobinson@projectbalance.com> wrote:

I’ve setup a new Ubuntu 16.04 box with PostgreSQL 9.5.5 and I’m restoring a
fairly large DHIS2 backup but having speed issues. It’s a full pg_dump in
custom format and about 650Gb compressed (a plain text dump produces a 7Gb
file). I made sure inserts were turned off, so that’s not the issue, but so
far it’s been running for 33 hours – CPU at 100% - with no end in sight.
This is a backup file that took 30 minutes to generate.

If it is running synchronously, I calculated that it’s on around 20 million
DB rows of around 170 million after 33 hours! Surely that’s not normal.

I’ve tweaked PostgreSQL with the following settings:

maintenance_work_mem=2GB

max_wal_size = 1Gb

checkpoint_timeout = 3600

checkpoint_completion_target = 0.9

Anyway, if anyone has insight or has had a similar experience, or
suggestions, please let me know!

I’m testing it on another (Windows) instance to see if there’s something
amiss.

Cheers,

Ed

_______________________________________________
Mailing list: https://launchpad.net/~dhis2-users
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-users
More help : https://help.launchpad.net/ListHelp

Edward_Robinson · 14 July 2017 12:20

Apologies, I meant 650Mb on the compressed.
I'll revert with more details shortly, it looks like a huge factor may be autovacuum being on (and possibly logging). I'm tweaking a little more then going to restart it.
Some sort of progress indicator would be nice for restoring large dumps like these.
I'll post my postgresql.conf and other details shortly. Thanks!

···

-----Original Message-----
From: Bob Jolliffe [mailto:bobjolliffe@gmail.com]
Sent: Friday, 14 July 2017 8:07 AM
To: Edward Robinson <erobinson@projectbalance.com>
Cc: dhis2-users@lists.launchpad.net
Subject: Re: [Dhis2-users] Major speed issue with pg_restore

I am also trying to understand how a plain sql dump is only 7G while the custom compressed format is 650G. But given that is true, why aren't you using the plain text dump?

On 14 July 2017 at 13:02, Bob Jolliffe <bobjolliffe@gmail.com> wrote:

Hi Edward

Can you tell us a bit more about the machine. In particular RAM size
and disk type/speed and filesystem type.

Given that you are probably not going to be running anything else on
this machine while you are restoring I think you might be able to
tweak a bit more aggressively just for this operation.

You don't mention shared_buffers above? That is a dramatic control
lever. Is that just something you left off your mail?

Bob

On 14 July 2017 at 02:54, Edward Robinson <erobinson@projectbalance.com> wrote:

I’ve setup a new Ubuntu 16.04 box with PostgreSQL 9.5.5 and I’m
restoring a fairly large DHIS2 backup but having speed issues. It’s
a full pg_dump in custom format and about 650Gb compressed (a plain
text dump produces a 7Gb file). I made sure inserts were turned off,
so that’s not the issue, but so far it’s been running for 33 hours – CPU at 100% - with no end in sight.
This is a backup file that took 30 minutes to generate.

If it is running synchronously, I calculated that it’s on around 20
million DB rows of around 170 million after 33 hours! Surely that’s not normal.

I’ve tweaked PostgreSQL with the following settings:

maintenance_work_mem=2GB

max_wal_size = 1Gb

checkpoint_timeout = 3600

checkpoint_completion_target = 0.9

Anyway, if anyone has insight or has had a similar experience, or
suggestions, please let me know!

I’m testing it on another (Windows) instance to see if there’s
something amiss.

Cheers,

Ed

_______________________________________________
Mailing list: https://launchpad.net/~dhis2-users
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-users
More help : https://help.launchpad.net/ListHelp

bobj · 14 July 2017 12:39

Yes I saw that note on autovacuum in one of the links you gave.
Certainly can do no harm to turn it off during restoration.

Regarding postgresql.conf, it is sometimes worth taking into account
that later settings in the file override earlier ones. This means
that you can gather all your tweaks together at the bottom of the file
rather than have them scattered throughout. Along the same lines, you
can also just include them in a separate customisation file by having
something like

include 'mytweaks.conf'

at the bottom of the file. This might be useful if you wanted to have
a configuration profile optimized for a particular type of operation.

···

On 14 July 2017 at 13:20, Edward Robinson <erobinson@projectbalance.com> wrote:

Apologies, I meant 650Mb on the compressed.
I'll revert with more details shortly, it looks like a huge factor may be autovacuum being on (and possibly logging). I'm tweaking a little more then going to restart it.
Some sort of progress indicator would be nice for restoring large dumps like these.
I'll post my postgresql.conf and other details shortly. Thanks!

-----Original Message-----
From: Bob Jolliffe [mailto:bobjolliffe@gmail.com]
Sent: Friday, 14 July 2017 8:07 AM
To: Edward Robinson <erobinson@projectbalance.com>
Cc: dhis2-users@lists.launchpad.net
Subject: Re: [Dhis2-users] Major speed issue with pg_restore

I am also trying to understand how a plain sql dump is only 7G while the custom compressed format is 650G. But given that is true, why aren't you using the plain text dump?

On 14 July 2017 at 13:02, Bob Jolliffe <bobjolliffe@gmail.com> wrote:

Hi Edward

Can you tell us a bit more about the machine. In particular RAM size
and disk type/speed and filesystem type.

Given that you are probably not going to be running anything else on
this machine while you are restoring I think you might be able to
tweak a bit more aggressively just for this operation.

You don't mention shared_buffers above? That is a dramatic control
lever. Is that just something you left off your mail?

Bob

On 14 July 2017 at 02:54, Edward Robinson <erobinson@projectbalance.com> wrote:

I’ve setup a new Ubuntu 16.04 box with PostgreSQL 9.5.5 and I’m
restoring a fairly large DHIS2 backup but having speed issues. It’s
a full pg_dump in custom format and about 650Gb compressed (a plain
text dump produces a 7Gb file). I made sure inserts were turned off,
so that’s not the issue, but so far it’s been running for 33 hours – CPU at 100% - with no end in sight.
This is a backup file that took 30 minutes to generate.

If it is running synchronously, I calculated that it’s on around 20
million DB rows of around 170 million after 33 hours! Surely that’s not normal.

I’ve tweaked PostgreSQL with the following settings:

maintenance_work_mem=2GB

max_wal_size = 1Gb

checkpoint_timeout = 3600

checkpoint_completion_target = 0.9

Anyway, if anyone has insight or has had a similar experience, or
suggestions, please let me know!

I’m testing it on another (Windows) instance to see if there’s
something amiss.

Cheers,

Ed

_______________________________________________
Mailing list: https://launchpad.net/~dhis2-users
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-users
More help : https://help.launchpad.net/ListHelp

Hannan · 14 July 2017 13:05

Hi Ed

Agree with Bob; it deepens on processor, ram disc space and configuration. You database size also not seems too big.

Now my usual db backup size is 4.8 GB (compressed) and restoring time is usually 4 hour. Definitely something wrong with the configuration.

As I am on vacation, cannot help you much on this; might able to help you in one week later.

Sorry for the inconvenience.

Regards

Hannan

···

On Fri, Jul 14, 2017 at 7:54 AM, Edward Robinson erobinson@projectbalance.com wrote:

I’ve setup a new Ubuntu 16.04 box with PostgreSQL 9.5.5 and I’m restoring a fairly large DHIS2 backup but having speed issues. It’s a full pg_dump in custom format and about 650Gb compressed (a plain text dump produces a 7Gb file). I made sure inserts were turned off, so that’s not the issue, but so far it’s been running for 33 hours – CPU at 100% - with no end in sight. This is a backup file that took 30 minutes to generate.

If it is running synchronously, I calculated that it’s on around 20 million DB rows of around 170 million after 33 hours! Surely that’s not normal.

I’ve tweaked PostgreSQL with the following settings:

maintenance_work_mem=2GB

max_wal_size = 1Gb

checkpoint_timeout = 3600

checkpoint_completion_target = 0.9

Anyway, if anyone has insight or has had a similar experience, or suggestions, please let me know!

I’m testing it on another (Windows) instance to see if there’s something amiss.

Cheers,

Ed

Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

–

Muhammad Abdul Hannan Khan

Team Leader

Support to the National HMIS

MIS, Director General of Health Service

Ministry of Health and Family Welfare

T +880-2- 58816459, 58816412 ext 118

F +88 02 58813 875

M+88 01819 239 241

M+88 01534 312 066

E hannank@gmail.com

S hannan.khan.dhaka

B hannan-tech.blogspot.com

L https://bd.linkedin.com/in/hannankhan

Edward_Robinson · 15 July 2017 17:40

Sorry for this slow response, time has not been on my side
Also, thanks Bob, Knut and Hannan for your responses.

So to fill everyone in, I went through postgres.conf with a fine toothed comb. @ Bob, Shared buffers were set to 3200 though I'd set them to 1600 for the restore as per this article's suggestions:

Overall I'm convinced the PGSQL environment had nothing to do with the issue... what I had suspect is that the combination of Ubuntu 16.04 and our hardware was not a good match. I suspect the SATA channel in particular was the bottleneck.
To prove a point (I have two identical machines), I loaded Win Server 2012 R2 and and installed PGSQL 9.5.5 on the box and, without so much as tweaking a single file, I restored the dump within 50 minutes. We're likely upgrading the platform soon so I'm not going to break my back troubleshooting what is likely an issue that won't be there in 6 months time. Dell didn't include Debian or Ubuntu on their list of supported OS for the hardware combination, though I have no reason to believe it couldn't work with a little effort and a bit of old school compiling, I just don't have the luxury of time to fiddle. It worked well with 12.04 out of the box.
I just thought I'd post this in case anyone else was having similar trouble - if you've double checked the configs, it's quite possibly a lower level issue.
Cheers!
Ed

bobj · 15 July 2017 19:02

Ed you never answered my earlier question about the hardware you were running on?

···

On 15 Jul 2017 6:40 p.m., “Edward Robinson” erobinson@projectbalance.com wrote:

Sorry for this slow response, time has not been on my side

Also, thanks Bob, Knut and Hannan for your responses.

So to fill everyone in, I went through postgres.conf with a fine toothed comb. @ Bob, Shared buffers were set to 3200 though I’d set them to 1600 for the restore as per this article’s suggestions:

http://www.databasesoup.com/2014/09/settings-for-fast-pgrestore.html

Overall I’m convinced the PGSQL environment had nothing to do with the issue… what I had suspect is that the combination of Ubuntu 16.04 and our hardware was not a good match. I suspect the SATA channel in particular was the bottleneck.

To prove a point (I have two identical machines), I loaded Win Server 2012 R2 and and installed PGSQL 9.5.5 on the box and, without so much as tweaking a single file, I restored the dump within 50 minutes. We’re likely upgrading the platform soon so I’m not going to break my back troubleshooting what is likely an issue that won’t be there in 6 months time. Dell didn’t include Debian or Ubuntu on their list of supported OS for the hardware combination, though I have no reason to believe it couldn’t work with a little effort and a bit of old school compiling, I just don’t have the luxury of time to fiddle. It worked well with 12.04 out of the box.

I just thought I’d post this in case anyone else was having similar trouble - if you’ve double checked the configs, it’s quite possibly a lower level issue.

Cheers!

Ed

Edward_Robinson · 16 July 2017 20:43

Hi Bob, apologies, it’s running on this hardware:

http://www.dell.com/support/home/ht/en/htbsdt1/product-support/servicetag/d2fvfx1/configuration

A bit dated, but definitely not a hardware error given the speedy / successful restore on Windows.

Regards

Ed

···

Ed you never answered my earlier question about the hardware you were running on?

On 15 Jul 2017 6:40 p.m., “Edward Robinson” erobinson@projectbalance.com wrote:

Sorry for this slow response, time has not been on my side

Also, thanks Bob, Knut and Hannan for your responses.

So to fill everyone in, I went through postgres.conf with a fine toothed comb. @ Bob, Shared buffers were set to 3200 though I’d set them to 1600 for the restore as per this article’s suggestions:

http://www.databasesoup.com/2014/09/settings-for-fast-pgrestore.html

Overall I’m convinced the PGSQL environment had nothing to do with the issue… what I had suspect is that the combination of Ubuntu 16.04 and our hardware was not a good match. I suspect the SATA channel in particular was the bottleneck.

To prove a point (I have two identical machines), I loaded Win Server 2012 R2 and and installed PGSQL 9.5.5 on the box and, without so much as tweaking a single file, I restored the dump within 50 minutes. We’re likely upgrading the platform soon so I’m not going to break my back troubleshooting what is likely an issue that won’t be there in 6 months time. Dell didn’t include Debian or Ubuntu on their list of supported OS for the hardware combination, though I have no reason to believe it couldn’t work with a little effort and a bit of old school compiling, I just don’t have the luxury of time to fiddle. It worked well with 12.04 out of the box.

I just thought I’d post this in case anyone else was having similar trouble - if you’ve double checked the configs, it’s quite possibly a lower level issue.

Cheers!

Ed

Lars · 17 July 2017 12:41

Hi Edward,

this is besides the main point but make sure you create pg dumps without the analytics tables, and instead let DHIS 2 re-generate the analytics tables (which is faster as indexing is run in parallel).

pg_dump -T analytics*

regards,

Lars

···

On Sun, Jul 16, 2017 at 10:43 PM, Edward Robinson erobinson@projectbalance.com wrote:

Hi Bob, apologies, it’s running on this hardware:

http://www.dell.com/support/home/ht/en/htbsdt1/product-support/servicetag/d2fvfx1/configuration

A bit dated, but definitely not a hardware error given the speedy / successful restore on Windows.

Regards

Ed

From: Bob Jolliffe [mailto:bobjolliffe@gmail.com]

Sent: Saturday, 15 July 2017 3:02 PM

To: Edward Robinson erobinson@projectbalance.com

Cc: Knut Staring knutst@gmail.com; dhis2-users dhis2-users@lists.launchpad.net; Hannan Khan hannank@gmail.com

Subject: RE: [Dhis2-users] Major speed issue with pg_restore

Ed you never answered my earlier question about the hardware you were running on?

On 15 Jul 2017 6:40 p.m., “Edward Robinson” erobinson@projectbalance.com wrote:

Sorry for this slow response, time has not been on my side

Also, thanks Bob, Knut and Hannan for your responses.

So to fill everyone in, I went through postgres.conf with a fine toothed comb. @ Bob, Shared buffers were set to 3200 though I’d set them to 1600 for the restore as per this article’s suggestions:

http://www.databasesoup.com/2014/09/settings-for-fast-pgrestore.html

Overall I’m convinced the PGSQL environment had nothing to do with the issue… what I had suspect is that the combination of Ubuntu 16.04 and our hardware was not a good match. I suspect the SATA channel in particular was the bottleneck.

To prove a point (I have two identical machines), I loaded Win Server 2012 R2 and and installed PGSQL 9.5.5 on the box and, without so much as tweaking a single file, I restored the dump within 50 minutes. We’re likely upgrading the platform soon so I’m not going to break my back troubleshooting what is likely an issue that won’t be there in 6 months time. Dell didn’t include Debian or Ubuntu on their list of supported OS for the hardware combination, though I have no reason to believe it couldn’t work with a little effort and a bit of old school compiling, I just don’t have the luxury of time to fiddle. It worked well with 12.04 out of the box.

I just thought I’d post this in case anyone else was having similar trouble - if you’ve double checked the configs, it’s quite possibly a lower level issue.

Cheers!

Ed

Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

–

Lars Helge Øverland

Lead developer, DHIS 2

University of Oslo

Skype: larshelgeoverland

lars@dhis2.org

http://www.dhis2.org

Edward_Robinson · 17 July 2017 16:19

That is useful info for the forum for sure and I fully agree, thanks Lars!

Ultimately the server supports Linux – it’s just certified for different distros. Also, I noticed a number of Ubuntu 16.04 users complaining of the same speed issue, so while Ubuntu will probably remain my go-to distro, in the case of these servers, I’m probably better off getting Suse or one of the other supported distros on before I take any more troubleshooting steps. For now it’s running fine and the future direction may be different – including cloud hosting to avoid this kind of hassle.

Thanks everyone for the support,

Cheers

Ed

···

Hi Edward,

this is besides the main point but make sure you create pg dumps without the analytics tables, and instead let DHIS 2 re-generate the analytics tables (which is faster as indexing is run in parallel).

pg_dump -T analytics*

regards,

Lars

On Sun, Jul 16, 2017 at 10:43 PM, Edward Robinson erobinson@projectbalance.com wrote:

Hi Bob, apologies, it’s running on this hardware:

http://www.dell.com/support/home/ht/en/htbsdt1/product-support/servicetag/d2fvfx1/configuration

A bit dated, but definitely not a hardware error given the speedy / successful restore on Windows.

Regards

Ed

From: Bob Jolliffe [mailto:bobjolliffe@gmail.com]

Sent: Saturday, 15 July 2017 3:02 PM

To: Edward Robinson erobinson@projectbalance.com

Cc: Knut Staring knutst@gmail.com; dhis2-users dhis2-users@lists.launchpad.net; Hannan Khan hannank@gmail.com

Subject: RE: [Dhis2-users] Major speed issue with pg_restore

Ed you never answered my earlier question about the hardware you were running on?

On 15 Jul 2017 6:40 p.m., “Edward Robinson” erobinson@projectbalance.com wrote:

Sorry for this slow response, time has not been on my side

Also, thanks Bob, Knut and Hannan for your responses.

So to fill everyone in, I went through postgres.conf with a fine toothed comb. @ Bob, Shared buffers were set to 3200 though I’d set them to 1600 for the restore as per this article’s suggestions:

http://www.databasesoup.com/2014/09/settings-for-fast-pgrestore.html

Overall I’m convinced the PGSQL environment had nothing to do with the issue… what I had suspect is that the combination of Ubuntu 16.04 and our hardware was not a good match. I suspect the SATA channel in particular was the bottleneck.

To prove a point (I have two identical machines), I loaded Win Server 2012 R2 and and installed PGSQL 9.5.5 on the box and, without so much as tweaking a single file, I restored the dump within 50 minutes. We’re likely upgrading the platform soon so I’m not going to break my back troubleshooting what is likely an issue that won’t be there in 6 months time. Dell didn’t include Debian or Ubuntu on their list of supported OS for the hardware combination, though I have no reason to believe it couldn’t work with a little effort and a bit of old school compiling, I just don’t have the luxury of time to fiddle. It worked well with 12.04 out of the box.

I just thought I’d post this in case anyone else was having similar trouble - if you’ve double checked the configs, it’s quite possibly a lower level issue.

Cheers!

Ed

Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

–

Lars Helge Øverland

Lead developer, DHIS 2

University of Oslo

Skype: larshelgeoverland

lars@dhis2.org

http://www.dhis2.org

bobj · 17 July 2017 18:32

It is unlikely (though not impossible) that the problem is ubuntu specific. It could be your windows for example, is making better use of the disk.

I think I see you had 2 sata disks which would probably work best in a raid 0 setup. Perhaps windows has arranged them thus. Maybe also some other tweaks to get the best out of the filesystem. Good luck.

···

On 17 July 2017 at 17:19, Edward Robinson erobinson@projectbalance.com wrote:

That is useful info for the forum for sure and I fully agree, thanks Lars!

Ultimately the server supports Linux – it’s just certified for different sistros. Also, I noticed a number of Ubuntu 16.04 users complaining of the same speed issue, so while Ubuntu will probably remain my go-to distro, in the case of these servers, I’m probably better off getting Suse or one of the other supported distros on before I take any more troubleshooting steps. For now it’s running fine and the future direction may be different – including cloud hosting to avoid this kind of hassle.

Thanks everyone for the support,

Cheers

Ed

From: Lars Helge Øverland [mailto:lars@dhis2.org]

Sent: Monday, 17 July 2017 8:41 AM

To: Edward Robinson erobinson@projectbalance.com

Cc: Bob Jolliffe bobjolliffe@gmail.com; dhis2-users dhis2-users@lists.launchpad.net

Subject: Re: [Dhis2-users] Major speed issue with pg_restore

Hi Edward,

this is besides the main point but make sure you create pg dumps without the analytics tables, and instead let DHIS 2 re-generate the analytics tables (which is faster as indexing is run in parallel).

pg_dump -T analytics*

regards,

Lars

On Sun, Jul 16, 2017 at 10:43 PM, Edward Robinson erobinson@projectbalance.com wrote:

Hi Bob, apologies, it’s running on this hardware:

http://www.dell.com/support/home/ht/en/htbsdt1/product-support/servicetag/d2fvfx1/configuration

A bit dated, but definitely not a hardware error given the speedy / successful restore on Windows.

Regards

Ed

From: Bob Jolliffe [mailto:bobjolliffe@gmail.com]

Sent: Saturday, 15 July 2017 3:02 PM

To: Edward Robinson erobinson@projectbalance.com

Cc: Knut Staring knutst@gmail.com; dhis2-users dhis2-users@lists.launchpad.net; Hannan Khan hannank@gmail.com

Subject: RE: [Dhis2-users] Major speed issue with pg_restore

Ed you never answered my earlier question about the hardware you were running on?

On 15 Jul 2017 6:40 p.m., “Edward Robinson” erobinson@projectbalance.com wrote:

Sorry for this slow response, time has not been on my side

Also, thanks Bob, Knut and Hannan for your responses.

So to fill everyone in, I went through postgres.conf with a fine toothed comb. @ Bob, Shared buffers were set to 3200 though I’d set them to 1600 for the restore as per this article’s suggestions:

http://www.databasesoup.com/2014/09/settings-for-fast-pgrestore.html

Overall I’m convinced the PGSQL environment had nothing to do with the issue… what I had suspect is that the combination of Ubuntu 16.04 and our hardware was not a good match. I suspect the SATA channel in particular was the bottleneck.

To prove a point (I have two identical machines), I loaded Win Server 2012 R2 and and installed PGSQL 9.5.5 on the box and, without so much as tweaking a single file, I restored the dump within 50 minutes. We’re likely upgrading the platform soon so I’m not going to break my back troubleshooting what is likely an issue that won’t be there in 6 months time. Dell didn’t include Debian or Ubuntu on their list of supported OS for the hardware combination, though I have no reason to believe it couldn’t work with a little effort and a bit of old school compiling, I just don’t have the luxury of time to fiddle. It worked well with 12.04 out of the box.

I just thought I’d post this in case anyone else was having similar trouble - if you’ve double checked the configs, it’s quite possibly a lower level issue.

Cheers!

Ed

Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

–

Lars Helge Øverland

Lead developer, DHIS 2

University of Oslo

Skype: larshelgeoverland

lars@dhis2.org

http://www.dhis2.org

Edward_Robinson · 18 July 2017 18:55

Maybe, but 2 days vs 50 minutes is a pretty big difference…

···

It is unlikely (though not impossible) that the problem is ubuntu specific. It could be your windows for example, is making better use of the disk.

I think I see you had 2 sata disks which would probably work best in a raid 0 setup. Perhaps windows has arranged them thus. Maybe also some other tweaks to get the best out of the filesystem. Good luck.

On 17 July 2017 at 17:19, Edward Robinson erobinson@projectbalance.com wrote:

That is useful info for the forum for sure and I fully agree, thanks Lars!

Ultimately the server supports Linux – it’s just certified for different sistros. Also, I noticed a number of Ubuntu 16.04 users complaining of the same speed issue, so while Ubuntu will probably remain my go-to distro, in the case of these servers, I’m probably better off getting Suse or one of the other supported distros on before I take any more troubleshooting steps. For now it’s running fine and the future direction may be different – including cloud hosting to avoid this kind of hassle.

Thanks everyone for the support,

Cheers

Ed

From: Lars Helge Øverland [mailto:lars@dhis2.org]

Sent: Monday, 17 July 2017 8:41 AM

To: Edward Robinson erobinson@projectbalance.com

Cc: Bob Jolliffe bobjolliffe@gmail.com; dhis2-users dhis2-users@lists.launchpad.net

Subject: Re: [Dhis2-users] Major speed issue with pg_restore

Hi Edward,

this is besides the main point but make sure you create pg dumps without the analytics tables, and instead let DHIS 2 re-generate the analytics tables (which is faster as indexing is run in parallel).

pg_dump -T analytics*

regards,

Lars

On Sun, Jul 16, 2017 at 10:43 PM, Edward Robinson erobinson@projectbalance.com wrote:

Hi Bob, apologies, it’s running on this hardware:

http://www.dell.com/support/home/ht/en/htbsdt1/product-support/servicetag/d2fvfx1/configuration

A bit dated, but definitely not a hardware error given the speedy / successful restore on Windows.

Regards

Ed

From: Bob Jolliffe [mailto:bobjolliffe@gmail.com]

Sent: Saturday, 15 July 2017 3:02 PM

To: Edward Robinson erobinson@projectbalance.com

Cc: Knut Staring knutst@gmail.com; dhis2-users dhis2-users@lists.launchpad.net; Hannan Khan hannank@gmail.com

Subject: RE: [Dhis2-users] Major speed issue with pg_restore

Ed you never answered my earlier question about the hardware you were running on?

On 15 Jul 2017 6:40 p.m., “Edward Robinson” erobinson@projectbalance.com wrote:

Sorry for this slow response, time has not been on my side

Also, thanks Bob, Knut and Hannan for your responses.

So to fill everyone in, I went through postgres.conf with a fine toothed comb. @ Bob, Shared buffers were set to 3200 though I’d set them to 1600 for the restore as per this article’s suggestions:

http://www.databasesoup.com/2014/09/settings-for-fast-pgrestore.html

Overall I’m convinced the PGSQL environment had nothing to do with the issue… what I had suspect is that the combination of Ubuntu 16.04 and our hardware was not a good match. I suspect the SATA channel in particular was the bottleneck.

To prove a point (I have two identical machines), I loaded Win Server 2012 R2 and and installed PGSQL 9.5.5 on the box and, without so much as tweaking a single file, I restored the dump within 50 minutes. We’re likely upgrading the platform soon so I’m not going to break my back troubleshooting what is likely an issue that won’t be there in 6 months time. Dell didn’t include Debian or Ubuntu on their list of supported OS for the hardware combination, though I have no reason to believe it couldn’t work with a little effort and a bit of old school compiling, I just don’t have the luxury of time to fiddle. It worked well with 12.04 out of the box.

I just thought I’d post this in case anyone else was having similar trouble - if you’ve double checked the configs, it’s quite possibly a lower level issue.

Cheers!

Ed

Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

–

Lars Helge Øverland

Lead developer, DHIS 2

University of Oslo

Skype: larshelgeoverland

lars@dhis2.org

http://www.dhis2.org

bobj · 18 July 2017 22:26

It is. How are your disks set up on windows and Ubuntu?

In general the postgresql folk do not claim good performance on windows. So there is some other factor at play.

···

On 18 Jul 2017 7:55 p.m., “Edward Robinson” erobinson@projectbalance.com wrote:

Maybe, but 2 days vs 50 minutes is a pretty big difference…

From: Bob Jolliffe [mailto:bobjolliffe@gmail.com]

Sent: Monday, 17 July 2017 2:33 PM

To: Edward Robinson erobinson@projectbalance.com

Cc: Lars Helge Øverland lars@dhis2.org; dhis2-users dhis2-users@lists.launchpad.net

Subject: Re: [Dhis2-users] Major speed issue with pg_restore

It is unlikely (though not impossible) that the problem is ubuntu specific. It could be your windows for example, is making better use of the disk.

I think I see you had 2 sata disks which would probably work best in a raid 0 setup. Perhaps windows has arranged them thus. Maybe also some other tweaks to get the best out of the filesystem. Good luck.

On 17 July 2017 at 17:19, Edward Robinson erobinson@projectbalance.com wrote:

That is useful info for the forum for sure and I fully agree, thanks Lars!

Ultimately the server supports Linux – it’s just certified for different sistros. Also, I noticed a number of Ubuntu 16.04 users complaining of the same speed issue, so while Ubuntu will probably remain my go-to distro, in the case of these servers, I’m probably better off getting Suse or one of the other supported distros on before I take any more troubleshooting steps. For now it’s running fine and the future direction may be different – including cloud hosting to avoid this kind of hassle.

Thanks everyone for the support,

Cheers

Ed

From: Lars Helge Øverland [mailto:lars@dhis2.org]

Sent: Monday, 17 July 2017 8:41 AM

To: Edward Robinson erobinson@projectbalance.com

Cc: Bob Jolliffe bobjolliffe@gmail.com; dhis2-users dhis2-users@lists.launchpad.net

Subject: Re: [Dhis2-users] Major speed issue with pg_restore

Hi Edward,

this is besides the main point but make sure you create pg dumps without the analytics tables, and instead let DHIS 2 re-generate the analytics tables (which is faster as indexing is run in parallel).

pg_dump -T analytics*

regards,

Lars

On Sun, Jul 16, 2017 at 10:43 PM, Edward Robinson erobinson@projectbalance.com wrote:

Hi Bob, apologies, it’s running on this hardware:

http://www.dell.com/support/home/ht/en/htbsdt1/product-support/servicetag/d2fvfx1/configuration

A bit dated, but definitely not a hardware error given the speedy / successful restore on Windows.

Regards

Ed

From: Bob Jolliffe [mailto:bobjolliffe@gmail.com]

Sent: Saturday, 15 July 2017 3:02 PM

To: Edward Robinson erobinson@projectbalance.com

Cc: Knut Staring knutst@gmail.com; dhis2-users dhis2-users@lists.launchpad.net; Hannan Khan hannank@gmail.com

Subject: RE: [Dhis2-users] Major speed issue with pg_restore

Ed you never answered my earlier question about the hardware you were running on?

On 15 Jul 2017 6:40 p.m., “Edward Robinson” erobinson@projectbalance.com wrote:

Sorry for this slow response, time has not been on my side

Also, thanks Bob, Knut and Hannan for your responses.

So to fill everyone in, I went through postgres.conf with a fine toothed comb. @ Bob, Shared buffers were set to 3200 though I’d set them to 1600 for the restore as per this article’s suggestions:

http://www.databasesoup.com/2014/09/settings-for-fast-pgrestore.html

Overall I’m convinced the PGSQL environment had nothing to do with the issue… what I had suspect is that the combination of Ubuntu 16.04 and our hardware was not a good match. I suspect the SATA channel in particular was the bottleneck.

To prove a point (I have two identical machines), I loaded Win Server 2012 R2 and and installed PGSQL 9.5.5 on the box and, without so much as tweaking a single file, I restored the dump within 50 minutes. We’re likely upgrading the platform soon so I’m not going to break my back troubleshooting what is likely an issue that won’t be there in 6 months time. Dell didn’t include Debian or Ubuntu on their list of supported OS for the hardware combination, though I have no reason to believe it couldn’t work with a little effort and a bit of old school compiling, I just don’t have the luxury of time to fiddle. It worked well with 12.04 out of the box.

I just thought I’d post this in case anyone else was having similar trouble - if you’ve double checked the configs, it’s quite possibly a lower level issue.

Cheers!

Ed

Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

–

Lars Helge Øverland

Lead developer, DHIS 2

University of Oslo

Skype: larshelgeoverland

lars@dhis2.org

http://www.dhis2.org

Edward_Robinson · 19 July 2017 14:16

Agree, the system has two independent disks, I hadn’t set up any mirroring or striping – I had basically just setup the OS in both cases and performed OS updates. Pretty much a vanilla config on both. My experience is that if the SATA channel drivers aren’t properly loaded, the disks will work in legacy mode which slows things down considerably since no interrupts are used. In the old days with the IDE channel, it was interrupt 14 and 15 for channel 1 and 2 but that’s no longer the case with SATA. To be honest I haven’t come across this before in Linux since most good distros have all the software required to run all but the newest hardware and given that these machines are a few years old, I would assume they are supported (but I imagine this may be something worth troubleshooting). Ultimately I do think online hosting for this project will probably be the best given the difficulty supporting any hardware in this location so we are exploring those options.

Some things I hadn’t tried yet:

https://serverfault.com/questions/790723/postgres-9-5-restoring-dumps-is-very-very-slow

“

···

It does not seem to be a PG problem. More likely it’s a language or locale problem with PG on Ubuntu 16.04, because installing huge indexed tables with foreign languages and therefore with many utf-8 chars cause the problem. Do we miss a language package? On another server with Ubuntu 14.04 everything works fine.

“

It’s not this, since I had dumped using the default copy method:

http://manpages.ubuntu.com/manpages/xenial/man1/pg_dump.1.html

“**--column-inserts**

   **--attribute-inserts**

       Dump data as **INSERT** commands with explicit column names (INSERT

       INTO table (column, ...) VALUES ...). This will make restoration

       very slow; it is mainly useful for making dumps that can be loaded

       into non-PostgreSQL databases. However, since this option generates

       a separate command for each row, an error in reloading a row causes

       only that row to be lost rather than the entire table contents.

**--inserts**

           Dump data as **INSERT** commands (rather than **COPY**). This will make

           restoration very slow; it is mainly useful for making dumps that

           can be loaded into non-PostgreSQL databases. However, since this

           option generates a separate command for each row, an error in

           reloading a row causes only that row to be lost rather than the

           entire table contents. Note that the restore might fail altogether

           if you have rearranged column order. The **--column-inserts** option is

           safe against column order changes, though even slower.

“