Data import

Hello,

As DHIS 2 becomes increasingly popular, it becomes more and more urgent to be able to import metadata and data from other systems.

We have discussed using an Extract, Transform, Load (ETL) tool (e.g. Kettle/Pentaho Data Integration), in which case it should probably be embedded in some way.

Ola and I have just come up with a simple format which we think could represent many types of data “out there”. That is, people should be guided in transforming their data into a simple tabular format, which could then be read by DHIS.
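To make the idea of a simple tabular format concrete, here is a minimal sketch of reading such a file. The column names (`dataelement`, `orgunit`, `period`, `value`) and the sample rows are assumptions made up for illustration; the actual layout is whatever the blueprint specifies.

```python
import csv
import io
from typing import List, NamedTuple


class DataValue(NamedTuple):
    data_element: str
    org_unit: str
    period: str
    value: str


# Hypothetical column names; the real layout is defined in the blueprint.
REQUIRED_COLUMNS = ("dataelement", "orgunit", "period", "value")


def read_rows(text: str) -> List[DataValue]:
    """Parse CSV text into DataValue records, rejecting incomplete input."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for line_no, row in enumerate(reader, start=2):  # header is line 1
        # Short rows yield None for absent fields, so treat None as empty.
        if any(not (row[c] or "").strip() for c in REQUIRED_COLUMNS):
            raise ValueError(f"line {line_no}: empty field")
        rows.append(DataValue(*(row[c].strip() for c in REQUIRED_COLUMNS)))
    return rows


# Invented sample data, for illustration only.
SAMPLE = """dataelement,orgunit,period,value
Population,District A,2009,12000
Population,District B,2009,8500
"""

if __name__ == "__main__":
    for record in read_rows(SAMPLE):
        print(record)
```

Rejecting a file with missing columns or empty fields up front is one way to guide people towards a consistent format before anything reaches the database.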

We have updated an existing blueprint:

https://blueprints.launchpad.net/dhis2/+spec/import-data

An alternative approach could be to require people to convert their data to SDMX or DXF. It would be good to have everyone’s thoughts on this.

Knut

Hi there.

One of the problems with ETL is that one must be careful not to
bypass business rules that are built into the procedural code but not
the DB. It seems that most of DHIS 2's logic is built into the
procedural layer (Java / JS) and not the DB itself. Of course, there
is some in the DB: foreign key relations and such. I think it could be
a bit risky, but it depends on which tables you are touching. I like
the idea of transformation to DXF, followed by importation, as we can
be sure that all the business logic will be enforced through the GUI.
That said, direct injection of routine/semi-permanent data has gone
quite smoothly for me. I imported a big hunk of population data with
Kettle, and it worked OK. I am not sure there will be a single
transformation that we can offer people, but rather some
well-documented examples and tips. Every data source that is a
candidate for transformation will probably be very different and
require some level of customization.

It might be good to start a "contrib" branch somewhere so we can begin
to collaborate on these issues. It has been mentioned in the past.
Perhaps a "contrib/etl" and a "contrib/reports" would be good. I have
developed a couple of generic BIRT reports that others may find
useful. I would think it would be useful to start assembling ETL
transforms as well. I can do it if there is consensus in the group.

Regards,
Jason

···

On Mon, Nov 2, 2009 at 5:36 PM, Knut Staring <knutst@gmail.com> wrote:

Hello,
As DHIS 2 becomes increasingly popular, it becomes more and more urgent to
be able to import metadata and data from other systems.
We have discussed using an Extract, Transform, Load (ETL) tool (e.g.
Kettle/Pentaho Data Integration), in which case it should probably be
embedded in some way.
Ola and I have just come up with a simple format which we think could
represent many types of data "out there". That is, people should be guided
in transforming their data into a simple tabular format, which could then be
read by DHIS.
We have updated an existing blueprint:
https://blueprints.launchpad.net/dhis2/+spec/import-data
An alternative approach could be to require people to convert their data
to SDMX or DXF. It would be good to have everyone's thoughts on this.
Knut
_______________________________________________
Mailing list: https://launchpad.net/~dhis2-devs
Post to : dhis2-devs@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-devs
More help : https://help.launchpad.net/ListHelp

Sharing generic BIRT reports would be useful. We already have a “resources” dir in trunk. If these things are not too big, feel free to put them in a subdir there.

Lars


Hi there.

One of the problems with ETL is one must be a bit careful about ignoring certain business rules that may be built into the procedural code, but not the DB. It seems that most of DHIS 2's logic is built into the procedural layer (Java / JS) and not the DB itself. Of course there is some there… foreign key relations and such.

The recommended option should be to go through a DHIS 2 GUI that uses the API and makes sure that validations, database foreign key constraints, etc. are handled. Connecting to CSV is one option, DXF another.

I think it could be a bit risky, but it depends on which tables you are touching. I like the idea of transformation to DXF, followed by importation, as we can be sure that all the business logic will be enforced through the GUI. Direct injection of routine/semipermanent data has gone quite smoothly for me. I imported a big hunk of population data with Kettle, and it worked OK. I am not sure there will be a single transformation that we can offer people, but rather some well documented examples and tips. Every single data source that is a candidate for transformation will probably be very different and require some level of customization.

Indeed. I think it makes a lot of sense to have Kettle as an alternative which could help people transform their data to DXF, with a few good examples that can be modified to the particular setting.
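As a rough illustration of what one such modifiable example could look like, here is a small sketch that turns tabular rows into XML while checking a simple rule first. The element and attribute names (`dataValues`, `dataValue`, `orgUnit`, and so on) are placeholders invented for this example and are not the real DXF schema; the org unit check is only a stand-in for the kind of validation the application layer would normally enforce.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, Set


def rows_to_xml(rows: Iterable[Dict[str, str]], known_org_units: Set[str]) -> str:
    """Build a placeholder XML document from tabular rows.

    Element and attribute names are illustrative only, NOT the real DXF schema.
    """
    root = ET.Element("dataValues")
    for i, row in enumerate(rows, start=1):
        # Stand-in for a rule that the application layer would normally enforce.
        if row["orgunit"] not in known_org_units:
            raise ValueError(f"row {i}: unknown org unit {row['orgunit']!r}")
        ET.SubElement(root, "dataValue", {
            "dataElement": row["dataelement"],
            "orgUnit": row["orgunit"],
            "period": row["period"],
            "value": row["value"],
        })
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Invented sample row, for illustration only.
    sample = [{"dataelement": "Population", "orgunit": "District A",
               "period": "2009", "value": "12000"}]
    print(rows_to_xml(sample, known_org_units={"District A"}))
```

Failing on the first unknown org unit keeps bad rows out of the generated file, so the import step that follows only ever sees data that passed the check.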

Knut

···

On Mon, Nov 2, 2009 at 5:21 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

It might be good to start a branch somewhere “contrib” so we can begin to collaborate on these issues. It has been mentioned in the past. Perhaps a “contrib/etl” and “contrib/reports” would be good. I have developed a couple of generic BIRT reports that others may find useful. I would think ETL transforms would be useful as well to start to assemble. I can do it if there is consensus with the group.

Regards,
Jason

Cheers,
Knut Staring