Data Import tools for DHIS2

Hello Knut, dhis2_devs, and users,

Do you have a data import tool written in Python that is available to use?
I need such tools for the following two tasks:
1. Importing existing HIS Excel data into DHIS2. We want to load several
years of historical data that have been collected in Excel files.
2. Importing from the Master Facility List (MFL, a web-based facility master
maintenance system) into DHIS2 to keep facility data synchronized. The MFL
can export its data to an Excel file, but it has no "Short name" field, so
we need to add that manually before importing.

If you do not have such a tool, or it is not available for use, please
advise me on alternatives if possible.
I believe this import capability is important for new DHIS users and
could also be used more widely for connecting DHIS2 with other systems.
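
One way to avoid typing the missing short names by hand is to derive them from the full facility names. The following is only a minimal sketch in Python: the function name `make_short_names` is hypothetical, and the 50-character limit is an assumption that should be checked against the DHIS2 version in use.

```python
def make_short_names(names, max_len=50):
    """Return a list of (name, short_name) pairs with unique short names.

    max_len=50 is an assumption; check the limit in your DHIS2 version.
    """
    used = set()
    pairs = []
    for name in names:
        # Collapse repeated whitespace, then truncate to the allowed length.
        base = " ".join(name.split())[:max_len].rstrip()
        candidate = base
        counter = 2
        # If truncation produced a clash, append a numeric suffix.
        while candidate.lower() in used:
            suffix = f" {counter}"
            candidate = base[: max_len - len(suffix)].rstrip() + suffix
            counter += 1
        used.add(candidate.lower())
        pairs.append((name, candidate))
    return pairs
```

The generated values could then be pasted into a new column of the MFL export and reviewed by hand before import.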

Best regards,
  Shinichi Suzuki

-----------------------------------------------------
Shinichi Suzuki
MIS Division MOPHS: LG37 AFYA House, Nairobi, KENYA
E-Mail: shin461@gmail.com
Phone: 0712-754-963
(JICA Senior Volunteer 21-4)
-----------------------------------------------------

Hi Shinichi,

You are completely right that it is important to be able to easily
import the orgunit hierarchy and historical data into DHIS2. The
general term for such operations is Extract, Transform and Load (ETL),
for which there are a number of powerful general tools available. Here
are two:

http://kettle.pentaho.com/
http://www.talend.com/index.php

I have personally also used ODBC connections in Access to transform
and load Excel data into Postgres directly.

A disadvantage of going directly into the database is that one loses
the validations that the DHIS2 import mechanism can perform (in
addition to the database constraints), and of course it is burdensome
to learn to use the above tools. DHIS2 has the capability to import
XML files, and thus also modern Excel files (.xlsx), and this should
probably be the common way. However, given the many different types of
data and vast range of potential data sources, we will probably never
have a simple wizard that does it all (and there is no sense in trying
to replicate Kettle or Talend).

Still, we have been thinking about defining a suitable standard format
for such import, as well as providing the user with some assistance in
transforming the data into shape for loading. We have also done some
work on extraction of data automatically from a large number of Excel
files (e.g. one or more per district) using Python. Work remains
before this work will reach a stage where it is robust and generic
enough - but I think working with the Kenyan data could help move this
process forward. I will refresh my memory on the status of the Python
work and get back to you.
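
As an illustration of extracting data from many per-district files, here is a minimal Python sketch. It is not the Python work mentioned above. It assumes each district sheet has first been saved as a CSV file, and the column names (`facility`, `period`, `data_element`, `value`) and the function name `collect_rows` are hypothetical.

```python
import csv
from pathlib import Path

# Assumed column names; adjust to the real layout of the district files.
REQUIRED = ("facility", "period", "data_element", "value")


def collect_rows(folder):
    """Read every *.csv file under `folder` and return (rows, problems)."""
    rows, problems = [], []
    for path in sorted(Path(folder).rglob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
            if missing:
                problems.append(f"{path.name}: missing columns {missing}")
                continue
            # Data starts on line 2 because line 1 is the header.
            for line_no, record in enumerate(reader, start=2):
                value = (record["value"] or "").strip()
                if not value:
                    continue  # empty cell: nothing was reported
                try:
                    float(value)
                except ValueError:
                    problems.append(f"{path.name}:{line_no}: bad value {value!r}")
                    continue
                row = {c: (record[c] or "").strip() for c in REQUIRED}
                row["source_file"] = path.name
                rows.append(row)
    return rows, problems
```

Keeping the source file name on every row makes it easier to trace a suspicious value back to the district that submitted it.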

Knut

On Mon, Nov 8, 2010 at 8:42 AM, Shinichi Suzuki <shin461@gmail.com> wrote:

--
Cheers,
Knut Staring

Hi Knut,

Thanks a lot. Your mail is encouraging.
I will read the two home pages, though I feel more familiar with your
Access-based approach.
I understand your concern about modifying the database directly.
I think DHIS2 should provide an open API for this kind of import; then that
concern would disappear.

Best regards,
    Shinichi Suzuki

-----Original Message-----
From: Knut Staring [mailto:knutst@gmail.com]
Sent: Monday, November 08, 2010 11:28 AM
To: Shinichi Suzuki
Cc: larshelge@gmail.com; dhis2-users@lists.launchpad.net
Subject: Re: Data Import tools for DHIS2

Hi Shinichi,

In fact as Knut highlighted, the appropriate method would be the
production of XML data (DXF) which DHIS2 already has robust import
mechanisms for. We have used direct insertion of data in the past, but
as Knut points out, and I will again, you must be very careful when
doing this. Much better to try and get the data in the correct format,
and import it in the recommended way via DXF if you can.
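
To make the idea of producing XML from tabular data concrete, here is a minimal Python sketch using the standard library. The element and attribute names (`dataValues`, `dataValue`, `orgUnit`, and so on) are placeholders chosen for illustration and are not the actual DXF schema; a practical starting point would be to export a small data set from DHIS2 and mirror the structure of that file. The function name `rows_to_xml` is hypothetical.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(rows):
    """Build an XML string from data value rows (placeholder element names)."""
    root = ET.Element("dataValues")
    for row in rows:
        # ElementTree escapes special characters such as & and < for us.
        ET.SubElement(
            root,
            "dataValue",
            {
                "orgUnit": row["facility"],
                "period": row["period"],
                "dataElement": row["data_element"],
                "value": str(row["value"]),
            },
        )
    return ET.tostring(root, encoding="unicode")
```

Building the document with an XML library rather than string concatenation avoids malformed output when facility names contain characters such as `&`.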

I am right in the middle of migrating data from a legacy system into
DHIS. I will try and document as much of the process as I can, which
may help you in your import.

Best regards,
Jason

On Tue, Nov 9, 2010 at 3:35 PM, Shinichi Suzuki <shin461@gmail.com> wrote:

_______________________________________________
Mailing list: https://launchpad.net/~dhis2-users
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-users
More help : https://help.launchpad.net/ListHelp

--
Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

Shinichi --
  I am migrating a legacy Access system into DHIS.
  For the org units, I am staging them in Access using a somewhat
extended version of the OrganisationUnit table, whose structure was
imported (install the Postgres or MySQL ODBC driver). I am doing a fair
amount of data cleaning, the biggest deal being making the names unique
as the old system had separate tables at each level of the hierarchy, so
could handle a facility named Kaneshi in a subdistrict named Kaneshi in
a district named Kaneshi, while another district named Kaneshie existed
in a different region. There is also the matter of determining what
facilities are still (or were ever) active, which I am doing by record
counts from the legacy system by year by facility by form/data table
(which correspond to each other in this system). The only useful org
unit attributes are type and owner, for which I have built an org unit
group set/org unit group table. Some other data like latitude and
longitude I am getting from an old site survey (I have hijacked GeoCode
to represent town); at the cost of matching the two sources of
organizational data, this other data allows me to validate some of the
data in the legacy system as well. Once the data is clean, I use Access
procedures to write SQL scripts loading the org unit groups and group
sets, then loading source, org unit and group set member. Then I go
into DHIS and do maintenance tasks to get the internal tables in synch
with the uploaded data.
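
The name-uniqueness cleanup described above could be approached in many ways; the message does not say which rule was used. The following Python sketch shows one possible rule, appending the level label to any name that occurs more than once. The function name `disambiguate` and the dictionary keys are hypothetical.

```python
from collections import Counter


def disambiguate(units):
    """Make org unit names unique by appending the level to repeated names.

    `units` is a list of dicts with 'name' and 'level' keys (assumed layout).
    Returns (new_units, clashes), where clashes lists names still duplicated.
    """
    counts = Counter(u["name"].strip().lower() for u in units)
    result = []
    for unit in units:
        name = unit["name"].strip()
        if counts[name.lower()] > 1:
            name = f"{name} {unit['level']}"
        result.append({**unit, "name": name})
    # Two units at the same level can still share a name; report those.
    remaining = Counter(u["name"].lower() for u in result)
    clashes = sorted(n for n, c in remaining.items() if c > 1)
    return result, clashes
```

Anything reported in `clashes` still needs a manual decision, for example adding the parent district to the name.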
  Org unit Level and period I do manually via a command line or
visual database tool.
  The next step is to define category options and option combos,
data elements, computed variables and datasets. This I do in DHIS2.
Then I link the Access staging DB to these tables to get the ids I need
to fill in data values and completedatasetregistration. Again this is
done with generated SQL scripts, one for each form. Some aggregation
takes place at this step due to differences in the organizational model
between these levels. I do a custom form for the dataset and run the
dataset report which I can compare to the reports from the legacy
system. Expect to run these import processes several times, so
eliminate old data for the same dataset and period before loading new.
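
The advice to clear old data for the same dataset and period before each reload can be sketched as a generated script that deletes first and inserts afterwards. This Python example uses an in-memory SQLite database as a stand-in; the table `datavalue_demo` and its columns are hypothetical and do not reflect the DHIS2 schema, and the original scripts were generated from Access rather than Python.

```python
import sqlite3


def sql_literal(value):
    """Quote a value as a SQL string literal, doubling single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def build_reload_script(dataset, period, rows):
    """Return a script that replaces all demo rows for one dataset and period."""
    lines = [
        "BEGIN;",
        "DELETE FROM datavalue_demo WHERE dataset = {} AND period = {};".format(
            sql_literal(dataset), sql_literal(period)
        ),
    ]
    for facility, element, value in rows:
        lines.append(
            "INSERT INTO datavalue_demo (dataset, period, facility, element, value) "
            "VALUES ({}, {}, {}, {}, {});".format(
                sql_literal(dataset), sql_literal(period),
                sql_literal(facility), sql_literal(element), sql_literal(value),
            )
        )
    lines.append("COMMIT;")
    return "\n".join(lines)


# Demo table (hypothetical layout) in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE datavalue_demo "
    "(dataset TEXT, period TEXT, facility TEXT, element TEXT, value TEXT)"
)
script = build_reload_script("ANC", "201001", [("Kaneshi", "First visits", 12)])
conn.executescript(script)
conn.executescript(script)  # a rerun replaces the rows instead of duplicating them
```

Wrapping the delete and the inserts in one transaction means a failed load leaves the previous data in place.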
  This is all quite tedious, but I do a good bit of quality
control. I am aware of Pentaho, but outside of the learning curve which
people much better than I have failed to scale, there is just no
substitute for exploratory data analysis to make sure your data is clean
before you start. And by the way, don't try to write to the database
from Access via ODBC, it's dog slow and gets slower as you add more data
until it dies altogether in midstream.
  I am willing to send you or post some of this stuff, but it's so
specific I'm afraid it will not be worth the effort, just learn the data
model and hack away.
Good luck, Roger

Hello Jason,

Thanks for your answer.
I tried exporting some data and opened it in a text editor to understand
the DXF format.

From my web search, DXF is a standard format for AutoCAD and other 3D CAD
software.

I could not work out how to create a file like that '.action' export from
an Excel or CSV file.
ETL tools look useful for exporting data for private analysis, but not
for importing into DHIS2.
Please tell me the name of the tool(s) you recommend for this.
If possible, could you send me a draft of your document once you have
written it?
I am afraid I am confused right now.

Many thanks again,
  Shinichi Suzuki

-----Original Message-----
From: Jason Pickering [mailto:jason.p.pickering@gmail.com]
Sent: Tuesday, November 09, 2010 5:25 PM
To: Shinichi Suzuki
Cc: Knut Staring; dhis2-users@lists.launchpad.net
Subject: Re: [Dhis2-users] Data Import tools for DHIS2

Thanks Suzuki,

Attached is a sample FTP district file.

Its data structure is similar to what we have customized in DHIS2. The difference is that DHIS2 captures data by health facility, whereas the FTP data is district-level; it is the legacy data that requires migration.

Regards

Baringo.zip (1.18 MB)

On Wed, Nov 10, 2010 at 4:18 PM, Shinichi Suzuki <shin461@gmail.com> wrote:

Samuel Cheburet
Ministry Of Health
P.O. Box 20781
Nairobi, Kenya
Mobile- 0721624338

Hello Roger,

Thanks for sharing your experience. It will help us a lot and is
important to me.
We have defined all OrganisationUnit data manually, except for the facilities.
I have started to enter the facility-level OrganisationUnit data manually too.
However, I need a more sophisticated way than manual input, because once we go
into production we expect to receive Excel data from another system on a
regular basis for updates.
Also, please refer to the e-mail from my colleague Samuel Cheburet, in which he
attached a sample of the health information data we need to import.
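
For regular facility updates, one starting point is to compare each new export with what is already loaded and act only on the differences. This minimal Python sketch assumes both sides can be read into lists of dictionaries that share a stable facility `code` and a `name`; those field names and the function `diff_facilities` are hypothetical.

```python
def diff_facilities(mfl_rows, dhis_rows, key="code"):
    """Compare two facility lists that share a stable code (assumed).

    Returns (to_add, to_update, missing_from_mfl).
    """
    mfl = {row[key]: row for row in mfl_rows}
    dhis = {row[key]: row for row in dhis_rows}
    # Facilities present in the export but not yet loaded.
    to_add = [mfl[k] for k in sorted(mfl.keys() - dhis.keys())]
    # Facilities present on both sides whose name has changed.
    to_update = [
        mfl[k]
        for k in sorted(mfl.keys() & dhis.keys())
        if mfl[k]["name"] != dhis[k]["name"]
    ]
    # Facilities that are loaded but no longer appear in the export.
    missing = [dhis[k] for k in sorted(dhis.keys() - mfl.keys())]
    return to_add, to_update, missing
```

Facilities in the third list are only reported, not deleted, since a facility may drop out of an export by mistake.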

Thank you very much,
     Shinichi Suzuki

···

-----Original Message-----
From: Friedman, Roger (CDC/CGH/DGHA) (CTR) [mailto:rdf4@cdc.gov]
Sent: Wednesday, November 10, 2010 3:20 AM
To: Shinichi Suzuki; Knut Staring
Cc: dhis2-users@lists.launchpad.net
Subject: RE: [Dhis2-users] Data Import tools for DHIS2

Shinchi --
  I am migrating a legacy Access system into DHIS.
  For the org units, I am staging them in Access using a somewhat
extended version of the OrganisationUnit table, whose structure was
imported (install the Postgres or MySQL ODBC driver). I am doing a fair
amount of data cleaning, the biggest deal being making the names unique
as the old system had separate tables at each level of the hierarchy, so
could handle a facility named Kaneshi in a subdistrict named Kaneshi in
a district named Kaneshi, while another district named Kaneshie existed
in a different region. There is also the matter of determining what
facilities are still (or were ever) active, which I am doing by record
counts from the legacy system by year by facility by form/data table
(which correspond to each other in this system). The only useful org
unit attributes are type and owner, for which I have built a org unit
group set/org unit group table. Some other data like latitude and
longitude I am getting from an old site survey (I have hijacked GeoCode
to represent town); at the cost of matching the two sources of
organizational data, this other data allows me to validate some of the
data in the legacy system as well. Once the data is clean, I use Access
procedures to write SQL scripts loading the org unit groups and group
sets, then loading source, org unit and group set member. Then I go
into DHIS and do maintenance tasks to get the internal tables in synch
with the uploaded data.
  Org unit Level and period I do manually via a command line or
visual database tool.
  The next step is to define category options and option combos,
data elements, computed variables and datasets. This I do in DHIS2.
Then I link the Access staging DB to these tables to get the ids I need
to fill in data values and completedatasetregistration. Again this is
done with generated SQL scripts, one for each form. Some aggregation
takes place at this step due to differences in the organizational model
between these levels. I do a custom form for the dataset and run the
dataset report which I can compare to the reports from the legacy
system. Expect to run these import processes several times, so
eliminate old data for the same dataset and period before loading new.
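
A rough sketch of such a repeatable load script, generated from Python instead of Access. The table name `datavalue` and its column names are assumptions made for illustration and should be checked against the data model of the DHIS version in use; the function name is hypothetical.

```python
def reload_script(element_ids, period_id, rows):
    """Build a SQL script that replaces one form/period worth of values.

    element_ids: ids of the data elements on the form being reloaded
    period_id:   id of the period being reloaded
    rows:        iterable of (dataelementid, sourceid,
                 categoryoptioncomboid, value) tuples
    Table and column names are assumptions, not a confirmed schema.
    """
    if not element_ids:
        raise ValueError("at least one data element id is required")
    ids = ", ".join(str(int(i)) for i in element_ids)
    lines = ["BEGIN;"]
    # Remove what an earlier run loaded for the same form and period.
    lines.append(
        f"DELETE FROM datavalue WHERE periodid = {int(period_id)} "
        f"AND dataelementid IN ({ids});"
    )
    for de, src, coc, value in rows:
        escaped = str(value).replace("'", "''")  # escape single quotes
        lines.append(
            "INSERT INTO datavalue "
            "(dataelementid, periodid, sourceid, categoryoptioncomboid, value) "
            f"VALUES ({int(de)}, {int(period_id)}, {int(src)}, {int(coc)}, '{escaped}');"
        )
    lines.append("COMMIT;")
    return "\n".join(lines)

# Made-up ids and values.
script = reload_script([101, 102], 7, [(101, 55, 1, "12"), (102, 55, 1, "3")])
```

Wrapping the delete and the inserts in one transaction means a failed run leaves the previous load in place rather than a half-empty period.
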
  This is all quite tedious, but I do a good bit of quality
control. I am aware of Pentaho, but outside of the learning curve which
people much better than I have failed to scale, there is just no
substitute for exploratory data analysis to make sure your data is clean
before you start. And by the way, don't try to write to the database
from Access via ODBC: it's dog slow and gets slower as you add more data
until it dies altogether in midstream.
  I am willing to send you or post some of this stuff, but it's so
specific I'm afraid it will not be worth the effort; just learn the data
model and hack away.
Good luck, Roger

-----Original Message-----
From: dhis2-users-bounces+rdf4=cdc.gov@lists.launchpad.net
[mailto:dhis2-users-bounces+rdf4=cdc.gov@lists.launchpad.net] On Behalf
Of Shinichi Suzuki
Sent: Tuesday, November 09, 2010 8:36 AM
To: 'Knut Staring'
Cc: dhis2-users@lists.launchpad.net
Subject: Re: [Dhis2-users] Data Import tools for DHIS2

Hi Knut,

Thanks a lot. I am encouraged by your mail.
I will try to read the two home pages, but your Access-based tools feel
more familiar to me.
I understand your concern about modifying the database directly.
I think DHIS2 should have an open API for this. Then your concern
would disappear.

Best regards,
    Shinichi Suzuki

-----Original Message-----
From: Knut Staring [mailto:knutst@gmail.com]
Sent: Monday, November 08, 2010 11:28 AM
To: Shinichi Suzuki
Cc: larshelge@gmail.com; dhis2-users@lists.launchpad.net
Subject: Re: Data Import tools for DHIS2

Hi Shinichi,

You are completely right that it is important to be able to easily
import the orgunit hierarchy and historical data into DHIS2. The
general term for such operations is Extract, Transform and Load (ETL),
for which there are a number of powerful general tools available. Here
are two:

http://kettle.pentaho.com/
http://www.talend.com/index.php
I have personally also used ODBC connections in Access to transform
and load Excel data into Postgres directly.

A disadvantage of going directly into the database is that one loses
the validations that the DHIS2 import mechanism can perform (in
addition to the database constraints), and of course it is burdensome
to learn to use the above tools. DHIS2 has the capability to import
XML files, and thus also modern Excel files (.xlsx), and this should
probably be the common way. However, given the many different types of
data and vast range of potential data sources, we will probably never
have a simple wizard that does it all (and there is no sense in trying
to replicate Kettle or Talend).

Still, we have been thinking about defining a suitable standard format
for such import, as well as providing the user with some assistance in
transforming the data into shape for loading. We have also done some
work on extraction of data automatically from a large number of Excel
files (e.g. one or more per district) using Python. Work remains
before this work will reach a stage where it is robust and generic
enough - but I think working with the Kenyan data could help move this
process forward. I will refresh my memory on the status of the Python
work and get back to you.

Knut

--
Cheers,
Knut Staring

_______________________________________________
Mailing list: DHIS 2 Users in Launchpad
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : DHIS 2 Users in Launchpad
More help : ListHelp - Launchpad Help

Hi Shinichi,
The format I was referring to is actually DXF or DHIS Exchange Format
(I think?). Anyway, it is the internal XML format used by DHIS to
transmit data between different DHIS systems, as well as for import of
data to DHIS from external sources, such as your legacy database. An
XSD of the schema is available here

http://bazaar.launchpad.net/~dhis2-documenters/dhis2/dhis2-docbook-docs/files/head%3A/src/schemas/dxf_v1_schema/

ETL tools are simply useful for transforming data from databases
of one format into different formats, such as CSV or XML (or
other database structures). I personally feel that the learning curve
that Roger mentioned is not so steep, and find Kettle to be extremely
intuitive, but this is very much a personal preference. There are many
different ways to migrate the data, conversion to XML being one of
them. It is really just a matter of personal preference using tools
that you are comfortable with.

Regards,
Jason

On Wed, Nov 10, 2010 at 3:18 PM, Shinichi Suzuki <shin461@gmail.com> wrote:

Hello Jason,

Thanks for your answer.
I tried exporting some data and looked at it in a text editor to understand
the DXF format.
From my web search, DXF is an AutoCAD or other 3D CAD standard format.
I could not create anything like that '.action' file from an Excel or CSV file.
ETL tools look useful for exporting data for private analysis, but not
for import into DHIS2.
Please give me the name(s) of the tool(s) you recommend for this.
Also, could you send me a draft of your document when you write it?
I am afraid I am confused right now.

Many thanks again,
Shinichi Suzuki

-----Original Message-----
From: Jason Pickering [mailto:jason.p.pickering@gmail.com]
Sent: Tuesday, November 09, 2010 5:25 PM
To: Shinichi Suzuki
Cc: Knut Staring; dhis2-users@lists.launchpad.net
Subject: Re: [Dhis2-users] Data Import tools for DHIS2

Hi Shinichi,

In fact as Knut highlighted, the appropriate method would be the
production of XML data (DXF) which DHIS2 already has robust import
mechanisms for. We have used direct insertion of data in the past, but
as Knut points out, and I will again, you must be very careful when
doing this. Much better to try and get the data in the correct format,
and import it in the recommended way via DXF if you can.

I am right in the middle of migrating data from a legacy system into
DHIS. I will try and document as much of the process as I can, which
may help you in your import.

Best regards,
Jason

--
Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

Hello again,

Let me again underline that the process of adding legacy data to DHIS2
or any information system consists of three parts - Extraction,
Transformation and Loading. I agree very much with Roger and Jason
that this process will differ for each country, and that one needs to
use a set of tools that are suited to ETL work and have thorough
knowledge of the data that is to be loaded and the model that one
wants to load into.

I do think doing a bit of manual entry of the legacy data into a test
instance of DHIS 2 (not the master copy) is a good exercise for
understanding what is needed in each particular case. There is no way
around a (manual) analytical phase, where one actively has to
look at the data available and their quality.

Still, there are certainly aspects of the process that are common to
all cases, and I think we could develop more support for those parts,
and I think the example files from Kenya can be very instructive in
this regard:

1) Extraction

There are over 20 Excel files for just Baringo district. Since you
have 149 districts, that means around 3000 Excel files in total, just
for 2009 and 2010. Each file type has its own structure, which
hopefully is quite constant across the districts. For this, I
recommend using a Python script to extract the data from the thousands
of files. I have started on such a script, but it needs a lot more
work and must be tailored to each of the formats.
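
As a rough illustration of the extraction step (not the script mentioned above), the directory walk and the CSV output can be kept generic while the per-format parsing is plugged in. The layout `root/<district>/<file>`, the use of the file name stem to pick a reader, and the column names are all assumptions; the actual Excel parsing would sit inside the reader functions, for example on top of a library such as xlrd or openpyxl.

```python
import csv
import tempfile
from pathlib import Path

def extract_to_csv(root, readers, out_path):
    """Walk root/<district>/<file> and write all extracted rows to one CSV.

    readers maps a file-type key (here the lower-cased file name stem, a
    hypothetical convention) to a function that takes a Path and yields
    (data_element, period, value) tuples. Files without a reader are
    returned so they can be inspected by hand.
    """
    skipped = []
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["district", "source_file", "data_element", "period", "value"])
        for path in sorted(Path(root).glob("*/*.xls*")):
            reader = readers.get(path.stem.lower())
            if reader is None:
                skipped.append(path)
                continue
            for data_element, period, value in reader(path):
                writer.writerow([path.parent.name, path.name, data_element, period, value])
    return skipped

def fake_reader(path):
    # Stand-in for a real Excel parser; the row is made-up sample data.
    yield ("OPD attendance", "2009-01", 120)

with tempfile.TemporaryDirectory() as tmp:
    district = Path(tmp) / "Baringo"
    district.mkdir()
    (district / "opd.xls").touch()
    (district / "unknown.xls").touch()
    out_file = Path(tmp) / "extracted.csv"
    skipped = extract_to_csv(tmp, {"opd": fake_reader}, out_file)
    with open(out_file, newline="", encoding="utf-8") as fh:
        extracted_rows = list(csv.reader(fh))
    skipped_names = [p.name for p in skipped]
```

Keeping the district and source file name on every row makes it possible to trace a suspicious value back to the spreadsheet it came from.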

2) Transformation

It is likely that not all the existing data will fit the new data
model well. For this cleaning stage, a database GUI such as PgAdmin
III or Access can be helpful, and I propose that the script mentioned
above would produce a CSV file for easy upload to an empty database
(not DHIS2).

There are bound to be quality issues, as Roger referred to. The data
need to go through a cleaning process, and the metadata must be
extracted on their own - i.e. Orgunit hierarchy, the Data Elements and
the dimensional breakdown (age, sex etc.). There is no way to
automate all the cleaning and analysis, but I think we could build up a
collection of useful SQL queries to be shared.
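
Two queries of the kind that could go into such a collection are sketched here, run against a staging table loaded from CSV. SQLite (from the Python standard library) stands in for the empty staging database, and the table name, column names and sample rows are made up for illustration; the GLOB operator is SQLite-specific and would need a different expression in Postgres or Access.

```python
import csv
import io
import sqlite3

SAMPLE_CSV = """district,source_file,data_element,period,value
Baringo,opd.xls,OPD attendance,2009-01,120
Baringo,opd.xls,OPD attendance,2009-01,125
Baringo,opd.xls,OPD attendance,2009-02,n/a
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staging (district TEXT, source_file TEXT, "
    "data_element TEXT, period TEXT, value TEXT)"
)
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
conn.executemany(
    "INSERT INTO staging VALUES "
    "(:district, :source_file, :data_element, :period, :value)",
    rows,
)

# Query 1: the same district/element/period reported more than once.
duplicates = conn.execute(
    "SELECT district, data_element, period, COUNT(*) FROM staging "
    "GROUP BY district, data_element, period HAVING COUNT(*) > 1"
).fetchall()

# Query 2: values that are empty or contain anything other than digits.
non_numeric = conn.execute(
    "SELECT district, period, value FROM staging "
    "WHERE value = '' OR value GLOB '*[^0-9]*'"
).fetchall()
```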

In combination with looking at the current data entry forms, this will
determine the database structure. It is vital to finalize the database
structure before starting to import the legacy data.

3) Loading

Finally, the only part I think we could standardize more is the actual
uploading to DHIS2. Data could be converted to a generic format for
import to DHIS2; the trickiest bit will likely be handling dimensions
and perhaps periods.

The best option here is probably the DHIS DXF (not the AutoCAD
format). I think it should be feasible for the project to create a
DataLoader for DHIS2, that would allow people to format their data in
a spreadsheet or a database view, and then DHIS2 will take care of the
import. But this is a sizeable project in itself, and will have to go
into the overall prioritization of blueprints.
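
To make the idea concrete, here is a minimal sketch of turning staged rows into XML, including one way of handling monthly periods. The element and attribute names are placeholders and deliberately not the DXF schema, which should be taken from the XSD; the `YYYY-MM` period notation is also an assumption.

```python
import calendar
import xml.etree.ElementTree as ET

def month_bounds(period):
    """Turn 'YYYY-MM' into ISO start and end dates of that month."""
    year, month = (int(part) for part in period.split("-"))
    last_day = calendar.monthrange(year, month)[1]
    return (f"{year:04d}-{month:02d}-01",
            f"{year:04d}-{month:02d}-{last_day:02d}")

def to_xml(rows):
    """Serialise (org_unit, data_element, period, value) rows to XML.

    Element and attribute names are placeholders, not the DXF schema.
    """
    root = ET.Element("dataValues")
    for org_unit, data_element, period, value in rows:
        start, end = month_bounds(period)
        ET.SubElement(root, "dataValue", {
            "orgUnit": org_unit,
            "dataElement": data_element,
            "periodStart": start,
            "periodEnd": end,
            "value": str(value),
        })
    return ET.tostring(root, encoding="unicode")

# Made-up sample row.
xml_text = to_xml([("Baringo", "OPD attendance", "2009-02", 120)])
```

Dimensions such as age and sex are not covered here; they would need their own mapping from the legacy columns to the category option combos defined in DHIS2.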

Knut

On Wed, Nov 10, 2010 at 2:52 PM, samuel cheburet <samuelcheburet@gmail.com> wrote:

Thanks Suzuki,
Attached is a sample district file in FTP.
Its data structure is similar to what we have customized in DHIS2, which
captures data by health facility. The FTP data is district-level data; it is
the legacy data that requires migration.
Regards

--
Samuel Cheburet
Ministry Of Health
P.O. Box 20781
Nairobi, Kenya
Mobile- 0721624338

--
Cheers,
Knut Staring

Hello Jason,

Thanks a lot. I will try to learn about ETL tools.
I will also look into the DHIS2 documents you pointed to.

Best regards,
   Shinichi Suzuki

Hello all,

I completely agree with Knut. Once realized, this will address all my concerns.
An HIS gathers health information data from the field. The field has
been computerized, and the data needs to be transferred to DHIS without
manual handling.
I am sure this will be a good advantage for DHIS users and prospective users.

Many thanks,
    Shinichi Suzuki

···

-----Original Message-----
From: Knut Staring [mailto:knutst@gmail.com]
Sent: Wednesday, November 10, 2010 7:27 PM
To: samuel cheburet
Cc: Shinichi Suzuki; dhis2-users@lists.launchpad.net
Subject: Re: [Dhis2-users] Data Import tools for DHIS2

Hello again,

Let me again underline that the process of adding legacy data to DHIS2
or any information system consists of three parts - Extraction,
Transformation and Loading. I agree very much with Roger and Jason
that this process will differ for each country, and that one needs to
use a set of tools that are suited to ETL work and have thorough
knowledge of the data that is to be loaded and the model that one
wants to load into.

I do think doing a bit of manual entry of the legacy data into a test
instance of DHIS 2 (not the master copy) is a good exercise for
understanding what is needed in each particular case. There is no way
around a (manual) analytical phase, where one actively has to
look at the data available and their quality.

Still, there are certainly aspects of the process that are common to
all cases, and I think we could develop more support for those parts,
and I think the example files from Kenya can be very instructive in
this regard:

1) Extraction

There are over 20 Excel files for just Baringo district. Since you
have 149 districts, that means around 3000 Excel files in total, just
for 2009 and 2010. Each file type has its own structure, which
hopefully is quite constant across the districts. For this, I
recommend using a Python script to extract the data from the thousands
of files. I have started on such a script, but it needs a lot more
work -
and must be tailored to each of the formats.

2) Transformation

It is likely that not all the existing data will fit the new data
model well. For this cleaning stage, a database GUI such as PgAdmin
III or Access can be helpful, and I propose that the script mentioned
above would produce a CSV file for easy upload to an empty database
(not DHIS2).

There are bound to be quality issues, as Roger referred to. The data
need to go through a cleaning process, and the metadata must be
extracted on their own - i.e. Orgunit hierarchy, the Data Elements and
the dimensional breakdown (age, sex etc.). There are no ways to
automate all the cleaning and analysis, but think we could build up a
collection of useful SQL queries to be shared.

In combination with looking at the current data entry forms, this will
determine the database structure. It is vital to finalize the database
structure before starting to import the legacy data.

3) Loading

Finally, the only part I think we could standardize more is the actual
uploading to DHIS2. Data could be converted to a generic format for
import to DHIS2 the most tricky bit will likely be handling dimensions
and perhaps periods.

The best option here is probably the DHIS DXF (not the Autocad GIS
format). I think it should be feasible for the project to create a
DataLoader for DHIS2, that would allow people to format their data in
a spreadsheet or a database view, and then DHIS2 will take care of the
import. But this is a sizeable project in itself, and will have to go
into the overall prioritization of blueprints.

Knut

On Wed, Nov 10, 2010 at 2:52 PM, samuel cheburet <samuelcheburet@gmail.com> wrote:

Thanks Suzuki,
Attached is a sample district file in FTP.
Looking at data structure which is similar to what we have customized to
dhis2 which capture data by health facility and FTP data is district data
which form legacy data which require migration.
Regards

On Wed, Nov 10, 2010 at 4:18 PM, Shinichi Suzuki <shin461@gmail.com> wrote:

Hello Json,

Thanks for your answer.
I tried to export the some data and looked it by text editor to understand
XDF format.
>From my web search, XDF is a AutoCAD or other 3D CAD standard format.
I could not create from Excel or CSV file to that '.action' file like.
ETL tools are looks useful for data exporting for private analysis and not
for import into DHIS2.
Please give me your recommended tool(s) name to do this.
Hopefully, could you send me your document draft when you write?
I am afraid I am confusion right now.

Many thanks again,
Shinichi Suzuki

-----Original Message-----
From: Jason Pickering [mailto:jason.p.pickering@gmail.com]
Sent: Tuesday, November 09, 2010 5:25 PM
To: Shinichi Suzuki
Cc: Knut Staring; dhis2-users@lists.launchpad.net
Subject: Re: [Dhis2-users] Data Import tools for DHIS2

Hi Shinchi,

In fact as Knut highlighted, the appropriate method would be the
production of XML data (DXF) which DHIS2 already has robust import
mechanisms for. We have used direct insertion of data in the past, but
as Knut points out, and I will again, you must be very careful when
doing this. Much better to try and get the data in the correct format,
and import it in the recommended way via DXF if you can.
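
To give an idea of what producing XML from tabular data involves, here is a
minimal Python sketch using the standard library. Note that the element and
attribute names (dataValues, dataValue, orgunit, dataelement, period, value)
are placeholders invented for this example and are not the actual DXF schema;
a file exported from your own DHIS2 instance shows the real structure to
match.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(rows):
    """Serialize a list of value rows to an XML string.

    The element and attribute names are placeholders, NOT the real DXF schema.
    """
    root = ET.Element("dataValues")
    for row in rows:
        ET.SubElement(root, "dataValue", attrib={
            "orgunit": str(row["orgunit"]),
            "dataelement": str(row["dataelement"]),
            "period": str(row["period"]),
            "value": str(row["value"]),
        })
    # ElementTree escapes special characters such as & and < for us.
    return ET.tostring(root, encoding="unicode")


# Hypothetical rows, e.g. as read from a cleaned-up Excel or CSV file.
example_rows = [
    {"orgunit": "Clinic A", "dataelement": "OPD visits", "period": "201009", "value": 120},
    {"orgunit": "Clinic B", "dataelement": "Malaria cases", "period": "201010", "value": 7},
]

print(rows_to_xml(example_rows))
```

Building the file with an XML library rather than by pasting strings together
avoids malformed output when facility names contain characters such as "&".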

I am right in the middle of migrating data from a legacy system into
DHIS. I will try and document as much of the process as I can, which
may help you in your import.

Best regards,
Jason

On Tue, Nov 9, 2010 at 3:35 PM, Shinichi Suzuki <shin461@gmail.com> wrote:
> Hi Knut,
>
> Thanks a lot. I am encouraged to read your mail.
> I will try to read the two home pages, but I feel more familiar with your
> Access-based tools.
> I understand your concern about using direct modification.
> I think DHIS2 should have an API to do this and open it up. Then your
> concern would disappear.
>
> Best regards,
> Shinichi Suzuki
>
> -----Original Message-----
> From: Knut Staring [mailto:knutst@gmail.com]
> Sent: Monday, November 08, 2010 11:28 AM
> To: Shinichi Suzuki
> Cc: larshelge@gmail.com; dhis2-users@lists.launchpad.net
> Subject: Re: Data Import tools for DHIS2
>
> Hi Shinichi,
>
> You are completely right that it is important to be able to easily
> import the orgunit hierarchy and historical data into DHIS2. The
> general term for such operations is Extract, Transform and Load (ETL),
> for which there are a number of powerful general tools available. Here
> are two:
> http://kettle.pentaho.com/
> http://www.talend.com/index.php
> I have personally also used ODBC connections in Access to transform
> and load Excel data into Postgres directly.
>
> A disadvantage of going directly into the database is that one loses
> the validations that the DHIS2 import mechanism can perform (in
> addition to the database constraints), and of course it is burdensome
> to learn to use the above tools. DHIS2 has the capability to import
> XML files, and thus also modern Excel files (.xlsx), and this should
> probably be the common way. However, given the many different types of
> data and vast range of potential data sources, we will probably never
> have a simple wizard that does it all (and there is no sense in trying
> to replicate Kettle or Talend).
>
> Still, we have been thinking about defining a suitable standard format
> for such import, as well as providing the user with some assistance in
> transforming the data into shape for loading. We have also done some
> work on extraction of data automatically from a large number of Excel
> files (e.g. one or more per district) using Python. Work remains
> before this work will reach a stage where it is robust and generic
> enough - but I think working with the Kenyan data could help move this
> process forward. I will refresh my memory on the status of the Python
> work and get back to you.
>
> Knut
>
>
> On Mon, Nov 8, 2010 at 8:42 AM, Shinichi Suzuki <shin461@gmail.com> wrote:
>> Hello, Knut and dhis2_devs, users
>>
>> Do you have a data import tool written by Python and available to use?
>> I need the tools for the following area.
>> 1. Existing HIS EXCEL data import to DHIS2. We want to load about several
>> years historical data which has been corrected into EXCEL files.
>> 2. Import from Master Facility List (Web based Facility Master maintenance
>> system) to DHIS2 for facility data synchronization. This MFL has a
>> capability to export the data into EXCEL file. But MFL does not have a
>> "Short name" and need to add the data manually. Then want to import it.
>>
>> If you do not have it or not available to use it, please advise me if
>> possible.
>> I believe this import capability is important for the new user of DHIS and
>> also could use inter system connection more widely.
>>
>> Best regards,
>> Shinichi Suzuki
>> -----------------------------------------------------
>> Shinichi Suzuki
>> MIS Division MOPHS: LG37 AFYA House, Nairobi, KENYA
>> E-Mail: shin461@gmail.com
>> Phone: 0712-754-963
>> (JICA Senior Volunteer 21-4)
>> -----------------------------------------------------
>>
>>
>>
>>
>
>
>
> --
> Cheers,
> Knut Staring
>
>
>
> _______________________________________________
> Mailing list: DHIS 2 Users in Launchpad
> Post to : dhis2-users@lists.launchpad.net
> Unsubscribe : DHIS 2 Users in Launchpad
> More help : ListHelp - Launchpad Help
>

--
Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

--
Samuel Cheburet
Ministry Of Health
P.O. Box 20781
Nairobi, Kenya
Mobile- 0721624338

--
Cheers,
Knut Staring