using a CSV file for visualization

Greetings to the DHIS2 community. I wish to consult the group on strategy and general procedures on how to use a CSV file as a data source for DHIS2 graphs. We have a local electronic medical record in our provincial hospital that generates a CSV file with each data set representing numerical values for an indicator. While XML might be a better file format, we are generating CSV due to the need to have the data be shared to another system. I was hoping to utilize DHIS2’s graphing capability by feeding the CSV to the DHIS2 engine. Is this possible and how?

thank you very much.

···


Alison O. Perez
Software Developer
Wireless Access for Health (WAH)
Province of Tarlac, Philippines

Linkedin Profile: ph.linkedin.com/pub/alison-perez/56/44/a65/

Personal Blog: http://darthblitzkrieg.blogspot.com
ICT4D/H Blog: http://alisonperez.tk
Skype / IM: darth_ali

“Daring ideas are like chessmen moved forward; they may be beaten, but they
may start a winning game.” – Johann Wolfgang Goethe*

Hi Alison,

In Rwanda we’re doing quite a bit by importing external data into a DHIS-2 instance that we call the “Data warehouse”.

Often this comes in as CSV files that we import into a temporary table we’ve created in the DHIS and then we create a SQL script to import it into the datavalues table. This table has a lot of data integrity constraints that make it difficult to import data.

Below are a couple of tips for the 4 required fields that in combination must be unique:

  1.    In order to fill the sourceid field, be sure to use the code field in the organisationunit table to store the facility code for the reporting units from your local EMR (perhaps in this case all data are from one facility).  Then you can link to the organisationunit table to pull the appropriate organisationunitid.
    
  2.    Make sure to create the dataelements in the DHIS-2 that correspond to your indicators and that you store the dataelement id in your OpenMRS system or store the OpenMRS concept code in the dataelement code field so that you can map to the dataelementid on import.
    
  3.    Periodid: this may be a challenge for you if the data you are importing are not of any specific periodicity.  I believe the smallest reporting period in DHIS-2 is currently daily.  If your data show hourly updates of lab test values you may not be able to import them into DHIS-2.
    
  4.    Categoryoptioncomboid:  if your data does not have dimensions (i.e. M/F, age group), look up the value for “default” – often 13 or 14 and use that to fill this column.
    

Also the value stored in the DHIS-2 is actually a text field, so be sure to convert the values to strings before importing.

As always, after importing or adding new data be sure to run the analytics table process or you won’t see your new data.

Randy

···

Greetings to the DHIS2 community. I wish to consult the group on strategy and general procedures on how to use a CSV file as a data source for DHIS2 graphs. We have a local electronic medical record in our provincial hospital that generates a CSV file with each data set representing numerical values for an indicator. While XML might be a better file format, we are generating CSV due to the need to have the data be shared to another system. I was hoping to utilize DHIS2’s graphing capability by feeding the CSV to the DHIS2 engine. Is this possible and how?

thank you very much.

Alison O. Perez

Software Developer

Wireless Access for Health (WAH)

Province of Tarlac, Philippines

Linkedin Profile:
ph.linkedin.com/pub/alison-perez/56/44/a65/

Personal Blog: http://darthblitzkrieg.blogspot.com

ICT4D/H Blog: http://alisonperez.tk

Skype / IM: darth_ali

"Daring ideas are like chessmen moved forward; they may be beaten, but they

may start a winning game." – Johann Wolfgang Goethe*

Randy,

Very useful thanks.

Is there a way to visualize the datavalueset and map it to the local EMR data model (as SQL queries)? Or is everything done by hand (eg, view schema, do SQL query, parse results into XML template)…

Alvin

···

On Oct 10, 2013 10:12 PM, “Wilson,Randy” rwilson@msh.org wrote:

Hi Alison,

In Rwanda we’re doing quite a bit by importing external data into a DHIS-2 instance that we call the “Data warehouse”.

Often this comes in as CSV files that we import into a temporary table we’ve created in the DHIS and then we create a SQL script to import it into the datavalues table. This table has a lot of data integrity constraints that make it difficult to import data.

Below are a couple of tips for the 4 required fields that in combination must be unique:

  1.    In order to fill the sourceid field, be sure to use the code field in the organisationunit table to store the facility code for the reporting units from your local EMR (perhaps in this case all data are from one facility).  Then you can link to the organisationunit table to pull the appropriate organisationunitid.
    
  1.    Make sure to create the dataelements in the DHIS-2 that correspond to your indicators and that you store the dataelement id in your OpenMRS system or store the OpenMRS concept code in the dataelement code field so that you can map to the dataelementid on import.
    
  1.    Periodid: this may be a challenge for you if the data you are importing are not of any specific periodicity.  I believe the smallest reporting period in DHIS-2 is currently daily.  If your data show hourly updates of lab test values you may not be able to import them into DHIS-2.
    
  1.    Categoryoptioncomboid:  if your data does not have dimensions (i.e. M/F, age group), look up the value for “default” – often 13 or 14 and use that to fill this column.
    

Also the value stored in the DHIS-2 is actually a text field, so be sure to convert the values to strings before importing.

As always, after importing or adding new data be sure to run the analytics table process or you won’t see your new data.

Randy

From: Dhis2-users [mailto:dhis2-users-bounces+rwilson=msh.org@lists.launchpad.net] On Behalf Of Alison Perez

Sent: Wednesday, October 09, 2013 6:10 PM

To: dhis2-users@lists.launchpad.net

Subject: [Dhis2-users] using a CSV file for visualization

Greetings to the DHIS2 community. I wish to consult the group on strategy and general procedures on how to use a CSV file as a data source for DHIS2 graphs. We have a local electronic medical record in our provincial hospital that generates a CSV file with each data set representing numerical values for an indicator. While XML might be a better file format, we are generating CSV due to the need to have the data be shared to another system. I was hoping to utilize DHIS2’s graphing capability by feeding the CSV to the DHIS2 engine. Is this possible and how?

thank you very much.

Alison O. Perez

Software Developer

Wireless Access for Health (WAH)

Province of Tarlac, Philippines

Linkedin Profile:
ph.linkedin.com/pub/alison-perez/56/44/a65/

Personal Blog: http://darthblitzkrieg.blogspot.com

ICT4D/H Blog: http://alisonperez.tk

Skype / IM: darth_ali

"Daring ideas are like chessmen moved forward; they may be beaten, but they

may start a winning game." – Johann Wolfgang Goethe*


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

Hi Alison

XML is not really a better file format. There’s nothing really wrong with CSV for tabular data.

Whereas we don’t always have control over the format of the CSV that we get, sometimes it is possible to process your CSV with a small script to convert it into dxf which can be imported. The benefit of doing it this way is that dhis2 will take care of the constraint violations Randy has referred to. Here’s a small example (also from Rwanda):

  1. the file format:

FOSA Code Year Month Indicator ID Value Facility Name Indicator Name

54 2013 2 181 1 ClinicA Number of new discordant couples where male partners are HIV positive

54 2013 2 182 1 ClinicA Number of new discordant couples where female partners are HIV positive

54 2013 2 183 2 ClinicA Number of new discordant couples registered during this month

54 2013 2 188 37 ClinicA Number of serodiscordant couples followed at Health Facility this month

429 2013 2 181 1 ClinicB Number of new discordant couples where male partners are HIV positive

429 2013 2 182 0 ClinicB Number of new discordant couples where female partners are HIV positive

429 2013 2 183 1 ClinicB Number of new discordant couples registered during this month

Note that these fields are actually tab separated. And, to make things more awkward, the file is UTF-16 encoded :frowning:

  1. The ‘FOSA Code’ column corresponds the code field on our orgunits in the dhis2 database. And we have dataelements in our database which we have assigned codes like ‘DE_181’, ‘DE_182’ etc corresponding to ‘Indicator ID’ column in the CSV.

  2. We can use a short script to convert this into the following valid dxf 2.0 file:

  1. this can be imported into dhis2.

The script is shown below. The requirement for iconv is a bit exotic and its just because this file comes from a system which encodes the data in UTF-16. Normally you wouldn’t need this. Otherwise its using awk to process the csv lines one by one. You might be able to do something similar with your csv file.

If you are not familiar with shell scripts or awk then this will look very geekish. But perhaps with this information you might get someone to help you with the details of the conversion script. The key thing is to have the codes mapped in your dhis2 database.

#!/bin/bash

--------------------------------------------------------------------------------------------

Convert from windoze utf16 format to ASCII, strip the ctrl-M characters and generate dxf xml

by processing csv using awk script

--------------------------------------------------------------------------------------------

iconv -f UTF-16 -t ASCII - |tr -d ‘\015’ | awk -f <(cat - <<-‘EOT’

BEGIN {

FS="\t";

printf “\n”;

}

{

if (NR!=1) {

printf("\n",$1, $2, $3, $4, $5)

}

}

END {

printf “\n”

}

EOT

) -

Regards

Bob

···

On 9 October 2013 17:10, Alison Perez perez.alison@gmail.com wrote:

Greetings to the DHIS2 community. I wish to consult the group on strategy and general procedures on how to use a CSV file as a data source for DHIS2 graphs. We have a local electronic medical record in our provincial hospital that generates a CSV file with each data set representing numerical values for an indicator. While XML might be a better file format, we are generating CSV due to the need to have the data be shared to another system. I was hoping to utilize DHIS2’s graphing capability by feeding the CSV to the DHIS2 engine. Is this possible and how?

thank you very much.


Alison O. Perez
Software Developer
Wireless Access for Health (WAH)
Province of Tarlac, Philippines

Linkedin Profile: ph.linkedin.com/pub/alison-perez/56/44/a65/

Personal Blog: http://darthblitzkrieg.blogspot.com
ICT4D/H Blog: http://alisonperez.tk
Skype / IM: darth_ali

“Daring ideas are like chessmen moved forward; they may be beaten, but they
may start a winning game.” – Johann Wolfgang Goethe*


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

Thank you for all the feedback and help. I guess, I will do first the script that creates the dxf file map with the dhis2 database. Are there any mandatory datasets that should be in DXF? I was thinking that the DHIS2 engine might find some data set that is missing during file importing process.

Thank you

···

On Fri, Oct 11, 2013 at 6:47 AM, Bob Jolliffe bobjolliffe@gmail.com wrote:

Hi Alison

XML is not really a better file format. There’s nothing really wrong with CSV for tabular data.

Whereas we don’t always have control over the format of the CSV that we get, sometimes it is possible to process your CSV with a small script to convert it into dxf which can be imported. The benefit of doing it this way is that dhis2 will take care of the constraint violations Randy has referred to. Here’s a small example (also from Rwanda):

  1. the file format:

FOSA Code Year Month Indicator ID Value Facility Name Indicator Name

54 2013 2 181 1 ClinicA Number of new discordant couples where male partners are HIV positive

54 2013 2 182 1 ClinicA Number of new discordant couples where female partners are HIV positive

54 2013 2 183 2 ClinicA Number of new discordant couples registered during this month

54 2013 2 188 37 ClinicA Number of serodiscordant couples followed at Health Facility this month

429 2013 2 181 1 ClinicB Number of new discordant couples where male partners are HIV positive

429 2013 2 182 0 ClinicB Number of new discordant couples where female partners are HIV positive

429 2013 2 183 1 ClinicB Number of new discordant couples registered during this month

Note that these fields are actually tab separated. And, to make things more awkward, the file is UTF-16 encoded :frowning:

  1. The ‘FOSA Code’ column corresponds the code field on our orgunits in the dhis2 database. And we have dataelements in our database which we have assigned codes like ‘DE_181’, ‘DE_182’ etc corresponding to ‘Indicator ID’ column in the CSV.
  1. We can use a short script to convert this into the following valid dxf 2.0 file:

  1. this can be imported into dhis2.

The script is shown below. The requirement for iconv is a bit exotic and its just because this file comes from a system which encodes the data in UTF-16. Normally you wouldn’t need this. Otherwise its using awk to process the csv lines one by one. You might be able to do something similar with your csv file.

If you are not familiar with shell scripts or awk then this will look very geekish. But perhaps with this information you might get someone to help you with the details of the conversion script. The key thing is to have the codes mapped in your dhis2 database.

#!/bin/bash

--------------------------------------------------------------------------------------------

Convert from windoze utf16 format to ASCII, strip the ctrl-M characters and generate dxf xml

by processing csv using awk script

--------------------------------------------------------------------------------------------

iconv -f UTF-16 -t ASCII - |tr -d ‘\015’ | awk -f <(cat - <<-‘EOT’

BEGIN {

FS="\t";

printf “\n”;

}

{

if (NR!=1) {

printf("\n",$1, $2, $3, $4, $5)

}

}

END {

printf “\n”

}

EOT

) -

Regards

Bob


Alison O. Perez
Software Developer
Wireless Access for Health (WAH)
Province of Tarlac, Philippines

Linkedin Profile: ph.linkedin.com/pub/alison-perez/56/44/a65/

Personal Blog: http://darthblitzkrieg.blogspot.com
ICT4D/H Blog: http://alisonperez.tk
Skype / IM: darth_ali

“Daring ideas are like chessmen moved forward; they may be beaten, but they
may start a winning game.” – Johann Wolfgang Goethe*

On 9 October 2013 17:10, Alison Perez perez.alison@gmail.com wrote:

Greetings to the DHIS2 community. I wish to consult the group on strategy and general procedures on how to use a CSV file as a data source for DHIS2 graphs. We have a local electronic medical record in our provincial hospital that generates a CSV file with each data set representing numerical values for an indicator. While XML might be a better file format, we are generating CSV due to the need to have the data be shared to another system. I was hoping to utilize DHIS2’s graphing capability by feeding the CSV to the DHIS2 engine. Is this possible and how?

thank you very much.


Alison O. Perez
Software Developer
Wireless Access for Health (WAH)
Province of Tarlac, Philippines

Linkedin Profile: ph.linkedin.com/pub/alison-perez/56/44/a65/

Personal Blog: http://darthblitzkrieg.blogspot.com
ICT4D/H Blog: http://alisonperez.tk
Skype / IM: darth_ali

“Daring ideas are like chessmen moved forward; they may be beaten, but they
may start a winning game.” – Johann Wolfgang Goethe*


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

Hi Alison

There are mandatory attributes in the dxf: orgUnit, period, dataElement and value. It is not necessary, but sometimes desirable to specify the dataset.

If you specify orgUnits or dataElements which don’t exist within dhis2, those values will be rejected.

Whether you will be able to produce this from your csv depends really on what your csv is like. Note the script I referred to works with this particular file, not all csv files. If you can provide a snippet of 2 or three lines from the top of the csv file (together with an explanation of the columns) I’ll have a better idea whether this is doable.

Regards

Bob

Regards

Bob

···

On 11 October 2013 02:16, Alison Perez perez.alison@gmail.com wrote:

Thank you for all the feedback and help. I guess, I will do first the script that creates the dxf file map with the dhis2 database. Are there any mandatory datasets that should be in DXF? I was thinking that the DHIS2 engine might find some data set that is missing during file importing process.

Thank you

On Fri, Oct 11, 2013 at 6:47 AM, Bob Jolliffe bobjolliffe@gmail.com wrote:

Hi Alison

XML is not really a better file format. There’s nothing really wrong with CSV for tabular data.

Whereas we don’t always have control over the format of the CSV that we get, sometimes it is possible to process your CSV with a small script to convert it into dxf which can be imported. The benefit of doing it this way is that dhis2 will take care of the constraint violations Randy has referred to. Here’s a small example (also from Rwanda):

  1. the file format:

FOSA Code Year Month Indicator ID Value Facility Name Indicator Name

54 2013 2 181 1 ClinicA Number of new discordant couples where male partners are HIV positive

54 2013 2 182 1 ClinicA Number of new discordant couples where female partners are HIV positive

54 2013 2 183 2 ClinicA Number of new discordant couples registered during this month

54 2013 2 188 37 ClinicA Number of serodiscordant couples followed at Health Facility this month

429 2013 2 181 1 ClinicB Number of new discordant couples where male partners are HIV positive

429 2013 2 182 0 ClinicB Number of new discordant couples where female partners are HIV positive

429 2013 2 183 1 ClinicB Number of new discordant couples registered during this month

Note that these fields are actually tab separated. And, to make things more awkward, the file is UTF-16 encoded :frowning:

  1. The ‘FOSA Code’ column corresponds the code field on our orgunits in the dhis2 database. And we have dataelements in our database which we have assigned codes like ‘DE_181’, ‘DE_182’ etc corresponding to ‘Indicator ID’ column in the CSV.
  1. We can use a short script to convert this into the following valid dxf 2.0 file:

  1. this can be imported into dhis2.

The script is shown below. The requirement for iconv is a bit exotic and its just because this file comes from a system which encodes the data in UTF-16. Normally you wouldn’t need this. Otherwise its using awk to process the csv lines one by one. You might be able to do something similar with your csv file.

If you are not familiar with shell scripts or awk then this will look very geekish. But perhaps with this information you might get someone to help you with the details of the conversion script. The key thing is to have the codes mapped in your dhis2 database.

#!/bin/bash

--------------------------------------------------------------------------------------------

Convert from windoze utf16 format to ASCII, strip the ctrl-M characters and generate dxf xml

by processing csv using awk script

--------------------------------------------------------------------------------------------

iconv -f UTF-16 -t ASCII - |tr -d ‘\015’ | awk -f <(cat - <<-‘EOT’

BEGIN {

FS="\t";

printf “\n”;

}

{

if (NR!=1) {

printf("\n",$1, $2, $3, $4, $5)

}

}

END {

printf “\n”

}

EOT

) -

Regards

Bob


Alison O. Perez
Software Developer
Wireless Access for Health (WAH)
Province of Tarlac, Philippines

Linkedin Profile: ph.linkedin.com/pub/alison-perez/56/44/a65/

Personal Blog: http://darthblitzkrieg.blogspot.com
ICT4D/H Blog: http://alisonperez.tk
Skype / IM: darth_ali

“Daring ideas are like chessmen moved forward; they may be beaten, but they
may start a winning game.” – Johann Wolfgang Goethe*

On 9 October 2013 17:10, Alison Perez perez.alison@gmail.com wrote:

Greetings to the DHIS2 community. I wish to consult the group on strategy and general procedures on how to use a CSV file as a data source for DHIS2 graphs. We have a local electronic medical record in our provincial hospital that generates a CSV file with each data set representing numerical values for an indicator. While XML might be a better file format, we are generating CSV due to the need to have the data be shared to another system. I was hoping to utilize DHIS2’s graphing capability by feeding the CSV to the DHIS2 engine. Is this possible and how?

thank you very much.


Alison O. Perez
Software Developer
Wireless Access for Health (WAH)
Province of Tarlac, Philippines

Linkedin Profile: ph.linkedin.com/pub/alison-perez/56/44/a65/

Personal Blog: http://darthblitzkrieg.blogspot.com
ICT4D/H Blog: http://alisonperez.tk
Skype / IM: darth_ali

“Daring ideas are like chessmen moved forward; they may be beaten, but they
may start a winning game.” – Johann Wolfgang Goethe*


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp