Problems populating Orgunits directly

Hello,

I am trying to populate the database directly with an enormous orgunit tree of the whole world (173 000 orgunits…). I have imported data to the following 3 tables:

source, organisationunit, and orgunithierarchystructure. Do I need to populate any other tables?

Also, in the organisationunit table, I had to remove the uniqueness constraints on both name and shortname, as there were tens of thousands of duplicate names. Could this be the source of the problems below? I can of course try to generate unique names in some way.
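One way the duplicate names could be made unique is sketched below. This is only an illustration, not anything DHIS does itself: the function name `make_unique_names` and the " (n)" suffix scheme are assumptions made up for this example, and the rows are invented sample data.

```python
from collections import Counter


def make_unique_names(rows):
    """Return {orgunit_id: unique_name} for a list of (orgunit_id, name) pairs.

    Hypothetical helper: names that occur once are kept as they are,
    repeated names get a numeric suffix such as "Springfield (2)".
    """
    totals = Counter(name for _, name in rows)
    # Names that are already unique must not be reused as a generated name.
    used = {name for name, count in totals.items() if count == 1}
    counters = Counter()
    result = {}
    for orgunit_id, name in rows:
        if totals[name] == 1:
            result[orgunit_id] = name
            continue
        # Try successive suffixes until one is free.
        while True:
            counters[name] += 1
            candidate = f"{name} ({counters[name]})"
            if candidate not in used:
                break
        used.add(candidate)
        result[orgunit_id] = candidate
    return result


# Invented sample rows: (orgunit_id, name)
sample_rows = [
    (1, "Springfield"),
    (2, "Oslo"),
    (3, "Springfield"),
    (4, "Springfield (1)"),
]
unique_names = make_unique_names(sample_rows)
```

A suffix based on the parent orgunit's name would probably read better to users than a bare number, but it needs the hierarchy to be loaded first.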

When starting DHIS, I get these messages:

  • INFO 17:49:09,272 Executing startup routine [3 of 12, runlevel 0]: OrganisationUnitHierarchyVerifier (DefaultStartupRoutineExecutor.java [Thread-1])

  • WARN 17:49:09,475 firstResult/maxResults specified with collection fetch; applying in memory! (QueryTranslatorImpl.java [Thread-1])

  • WARN 17:49:15,710 fail-safe cleanup (collections) : org.hibernate.engine.loading.CollectionLoadContext@114382d<rs=com.mchange.v2.c3p0.impl.NewProxyResultSet@a574b2> (LoadContexts.java [Thread-1])

  • WARN 17:49:15,710 On CollectionLoadContext#cleanup, localLoadingCollectionKeys contained [1] entries (CollectionLoadContext.java [Thread-1])

Any thoughts?

Knut

The problem turned out to be that I had forgotten how the hierarchy has to start: the top level needs NULL as its parentid. The only remaining problem is that it’s a bit slow. Perhaps we can add some indexing? Jason?
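A quick sanity check for this kind of import can be sketched as follows, assuming the orgunit rows have been exported as (id, parentid) pairs with `None` standing in for SQL NULL. The function name `check_hierarchy` and the sample rows are made up for illustration.

```python
def check_hierarchy(rows):
    """Return (roots, orphans) for a list of (orgunit_id, parent_id) pairs.

    roots:   ids whose parent_id is None (NULL), i.e. top-level orgunits
    orphans: ids whose parent_id points at an id that is not in the data
    """
    ids = {orgunit_id for orgunit_id, _ in rows}
    roots = [oid for oid, parent in rows if parent is None]
    orphans = [
        oid for oid, parent in rows
        if parent is not None and parent not in ids
    ]
    return roots, orphans


# Invented sample: one proper root, two children, one row with a bad parent.
sample_rows = [(1, None), (2, 1), (3, 1), (4, 99)]
roots, orphans = check_hierarchy(sample_rows)
```

An empty `roots` list would correspond to the mistake described above: no row has NULL as parentid, so there is nowhere for the tree to start.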

Uniqueness is not needed, and neither is the orgunithierarchystructure table - what is this for?

http://www.openhealthconsortium.org/wiki/doku.php?id=administrative_boundaries

Hi Knut,
If you are using PostgreSQL, you should run a vacuum/analyze as a start.

Can you provide a dump of the DB, assuming it is not huge? Just the
orgunit tables will do. I can try to take a look at it.
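To illustrate why an index on parentid together with fresh statistics can help with tree lookups, here is a small self-contained sketch. It uses an in-memory SQLite database purely as a stand-in so that it runs anywhere; it is not the DHIS schema. The column name `organisationunitid`, the index name `idx_orgunit_parentid`, and the generated rows are assumptions. On PostgreSQL, the statistics step would be `VACUUM ANALYZE organisationunit;` rather than SQLite's `ANALYZE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in table; only parentid and name are taken from the thread.
conn.execute(
    """
    CREATE TABLE organisationunit (
        organisationunitid INTEGER PRIMARY KEY,
        name TEXT,
        parentid INTEGER REFERENCES organisationunit (organisationunitid)
    )
    """
)

# One root with NULL parentid, then 999 generated units with 10 children each.
rows = [(1, "World", None)]
rows += [(i, f"Unit {i}", 1 + (i - 2) // 10) for i in range(2, 1001)]
conn.executemany("INSERT INTO organisationunit VALUES (?, ?, ?)", rows)

# Index the column used to find the children of an orgunit,
# then refresh the planner statistics.
conn.execute(
    "CREATE INDEX idx_orgunit_parentid ON organisationunit (parentid)"
)
conn.execute("ANALYZE")

# Ask the planner how it would look up the children of orgunit 1.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT organisationunitid FROM organisationunit WHERE parentid = ?",
    (1,),
).fetchall()
plan_text = " ".join(row[3] for row in plan)

child_count = conn.execute(
    "SELECT COUNT(*) FROM organisationunit WHERE parentid = ?", (1,)
).fetchone()[0]

print(plan_text)
print(child_count)
```

Whether DHIS already creates such an index, and whether it is the actual bottleneck here, is not established in this thread; the sketch only shows how to check what the planner does.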

Best regards,
Jason

_______________________________________________
Mailing list: https://launchpad.net/~dhis2-devs
Post to : dhis2-devs@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-devs
More help : https://help.launchpad.net/ListHelp

Yes, vacuuming is a good idea. Also, things sped up when I created users that only have access to one region or country.

Right now I am adding 60 000 facilities. I will make a backup file available soon.

Knut


Hm… I ran into some heap space problems, probably rooted in some other mistake, but I will have to stop for today. In the meantime, you can have a look at the dump without the facilities, available here:

http://97.107.130.50/files/

k


You can now find a 5MB postgres backup file at http://97.107.130.50/files/ which has 245427 orgunits, including 60 000 facilities.

Also, the speed is much better when running the database on the same machine (I was using a server on the LAN before).

Though we have a lot of orgunit information now, there is certainly a lot missing (e.g. we don’t have facilities for many countries), and there are a lot of errors. My idea is that we could run this as a master database on a server (at UiO?) and give access to certain country users who can help correct it. We should already be in a position to improve things substantially for countries where DHIS has been introduced.

Knut
