Connect to data

What’s the best way to extract the DHIS data for further analysis, etc. in a statistical program like R? Is there a more automatic way to do it, or would I need to export the data manually in order to re-import it into R. I would have thought the API would do this, but it seems to only contain metadata as far as I can tell.

Thanks,

Tom

···

This message was scanned for viruses with Trend Micro ScanMail, GFI MailSecurity and GFI MailEssentials by the World Health Organization Regional Office for the Western Pacific. However, the recipient is advised to scan this e-mail and any attached files for viruses.

Disclaimer:

This e-mail, together with any attachments, is intended for the named recipients only and is confidential. It may also be privileged or otherwise protected by law.

If you have received it in error, please notify the sender immediately by reply e-mail and delete it and any attachments from your system. You may not copy or disclose its contents to anyone.

Hi again Tom,

As is the case in many tasks for R, there are numerous ways to do this. I have used the RODBC and RPostgreSQL packages of R to retrieve data from the DHIS2 database. Both of these are documented in the documentation. The trick of course is that you must write the SQL to retrieve the data into an R data frame.

Other ways might be to export the data as DXF or through the WebAPI and then parse the XML in R. CSV is another good option of course, which DHIS2 supports in various modules, such as the report tables which can export CSV directly.

Regards,

Jason

···

On Mon, May 21, 2012 at 5:50 AM, hiattt@wpro.who.int wrote:

What’s the best way to extract the DHIS data for further analysis, etc. in a statistical program like R? Is there a more automatic way to do it, or would I need to export the data manually in order to re-import it into R. I would have thought the API would do this, but it seems to only contain metadata as far as I can tell.

Thanks,

Tom


This message was scanned for viruses with Trend Micro ScanMail, GFI MailSecurity and GFI MailEssentials by the World Health Organization Regional Office for the Western Pacific. However, the recipient is advised to scan this e-mail and any attached files for viruses.

Disclaimer:

This e-mail, together with any attachments, is intended for the named recipients only and is confidential. It may also be privileged or otherwise protected by law.

If you have received it in error, please notify the sender immediately by reply e-mail and delete it and any attachments from your system. You may not copy or disclose its contents to anyone.


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

Thanks, Jason.

So far I’ve only been interacting with the database through the user interface, but it sounds like if I want to pull data I need to go straight to the database tables, right? I wouldn’t mind exporting, but I’m trying to have DHIS also serve as a functional database that I can keep the data in and access directly.

Thanks again,

Tom

···

From: Jason Pickering [mailto:jason.p.pickering@gmail.com]

Sent: 21 May 2012 12:24

To: Hiatt, Mr Tom (WPRO)

Cc: dhis2-users@lists.launchpad.net

Subject: Re: [Dhis2-users] Connect to data

Hi again Tom,

As is the case in many tasks for R, there are numerous ways to do this. I have used the RODBC and RPostgreSQL packages of R to retrieve data from the DHIS2 database. Both of these are documented in the documentation. The trick of course is that you must write the SQL to retrieve the data into an R data frame.

Other ways might be to export the data as DXF or through the WebAPI and then parse the XML in R. CSV is another good option of course, which DHIS2 supports in various modules, such as the report tables which can export CSV directly.

Regards,

Jason

On Mon, May 21, 2012 at 5:50 AM, hiattt@wpro.who.int wrote:

What’s the best way to extract the DHIS data for further analysis, etc. in a statistical program like R? Is there a more automatic way to do it, or would I need to export the data manually in order to re-import it into R. I would have thought the API would do this, but it seems to only contain metadata as far as I can tell.

Thanks,

Tom


This message was scanned for viruses with Trend Micro ScanMail, GFI MailSecurity and GFI MailEssentials by the World Health Organization Regional Office for the Western Pacific. However, the recipient is advised to scan this e-mail and any attached files for viruses.

Disclaimer:

This e-mail, together with any attachments, is intended for the named recipients only and is confidential. It may also be privileged or otherwise protected by law.

If you have received it in error, please notify the sender immediately by reply e-mail and delete it and any attachments from your system. You may not copy or disclose its contents to anyone.


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp


This message was scanned for viruses with Trend Micro ScanMail, GFI MailSecurity and GFI MailEssentials by the World Health Organization Regional Office for the Western Pacific. However, the recipient is advised to scan this e-mail and any attached files for viruses.

Disclaimer:

This e-mail, together with any attachments, is intended for the named recipients only and is confidential. It may also be privileged or otherwise protected by law.

If you have received it in error, please notify the sender immediately by reply e-mail and delete it and any attachments from your system. You may not copy or disclose its contents to anyone.

Hi Tom,

if you refer to aggregated data (from the aggregated* tables) there is a way to do this - unfortunately it is in trunk only and not documented yet - will try a quick explanation here:

To get the aggregated data you can interact with the reportTables resource. First you need to look up the ids (uids) of the data elements, indicators, org units you need using the api starting here (from the snapshot demo in this example):

http://apps.dhis2.org/dev/api/resources

You can then retrieve the data like this :

http://apps.dhis2.org/dev/api/reportTables/data.xml?de=fbfJHSPpUQD&de=cYeuwXTCPkU&in=Uvn6LCg7dVU&ou=O6uvpzGd5pu&last12Months=true

You can substitute “.xml” with any of “.json”, “.jsonp”, “.csv”, “.pdf”, “.xls”, “.html”.

You can repeat the de, in and ou query params as many times you want to get data for multiple meta-data objects.

Valid params for periods are

reportingMonth

reportingBimonth

reportingQuarter

lastSixMonth

monthsThisYear

quartersThisYear

thisYear

monthsLastYear

quartersLastYear

lastYear

last5Years

last12Months

last6BiMonths

last4Quarters

last2SixMonths

thisFinancialYear

lastFinancialYear

last5FinancialYears

last52Weeks

Let me know it this helps. Lars

Thanks, Lars. This is great and very useful! Now I just have to figure out how to have R pass those parameters to the URL and then return the CSV file into R.

Thanks again,

Tom

···

From: Lars Helge Øverland [mailto:larshelge@gmail.com]

Sent: 23 May 2012 06:03

To: Hiatt, Mr Tom (WPRO)

Cc: jason.p.pickering@gmail.com;
dhis2-users@lists.launchpad.net

Subject: Re: [Dhis2-users] Connect to data

Hi Tom,

if you refer to aggregated data (from the aggregated* tables) there is a way to do this - unfortunately it is in trunk only and not documented yet - will try a quick explanation here:

To get the aggregated data you can interact with the reportTables resource. First you need to look up the ids (uids) of the data elements, indicators, org units you need using the api starting here (from the snapshot demo in this example):

http://apps.dhis2.org/dev/api/resources

You can then retrieve the data like this :

http://apps.dhis2.org/dev/api/reportTables/data.xml?de=fbfJHSPpUQD&de=cYeuwXTCPkU&in=Uvn6LCg7dVU&ou=O6uvpzGd5pu&last12Months=true

You can substitute “.xml” with any of “.json”, “.jsonp”, “.csv”, “.pdf”, “.xls”, “.html”.

You can repeat the de, in and ou query params as many times you want to get data for multiple meta-data objects.

Valid params for periods are

reportingMonth

reportingBimonth

reportingQuarter

lastSixMonth

monthsThisYear

quartersThisYear

thisYear

monthsLastYear

quartersLastYear

lastYear

last5Years

last12Months

last6BiMonths

last4Quarters

last2SixMonths

thisFinancialYear

lastFinancialYear

last5FinancialYears

last52Weeks

Let me know it this helps. Lars


This message was scanned for viruses with Trend Micro ScanMail, GFI MailSecurity and GFI MailEssentials by the World Health Organization Regional Office for the Western Pacific. However, the recipient is advised to scan this e-mail and any attached files for viruses.

Disclaimer:

This e-mail, together with any attachments, is intended for the named recipients only and is confidential. It may also be privileged or otherwise protected by law.

If you have received it in error, please notify the sender immediately by reply e-mail and delete it and any attachments from your system. You may not copy or disclose its contents to anyone.

Hi Tom and Lars,

The following R code will retrieve the data as per Lars example, and place this into an R data frame. Maybe this will help to get you started.

Best Regards,

Jason

library(RCurl)

h = basicTextGatherer()

response<-getURL("http://apps.dhis2.org/dev/api/reportTables/data.csv?de=fbfJHSPpUQD&de=cYeuwXTCPkU&in=Uvn6LCg7dVU&ou=O6uvpzGd5pu&last12Months=true

",userpwd=“admin:district”,httpauth = 1L,

header=TRUE,headerfunction=h$update)

#Ugly hack to get ride of the header

stripme<-paste(“HTTP/1.1 200 OK\r\n”,paste(h$value(NULL)[-1], collapse=“”),sep=“”)

data<-gsub(stripme,“”,response)

#Unquote the data

data<-noquote(data)

#here is the data.

foo<-read.table(textConnection(data),sep=“,”,header=T)

···

On Wed, May 23, 2012 at 5:17 AM, hiattt@wpro.who.int wrote:

Thanks, Lars. This is great and very useful! Now I just have to figure out how to have R pass those parameters to the URL and then return the CSV file into R.

Thanks again,

Tom


From: Lars Helge Øverland [mailto:larshelge@gmail.com]

Sent: 23 May 2012 06:03

To: Hiatt, Mr Tom (WPRO)

Subject: Re: [Dhis2-users] Connect to data

Cc: jason.p.pickering@gmail.com;
dhis2-users@lists.launchpad.net

Hi Tom,

if you refer to aggregated data (from the aggregated* tables) there is a way to do this - unfortunately it is in trunk only and not documented yet - will try a quick explanation here:

To get the aggregated data you can interact with the reportTables resource. First you need to look up the ids (uids) of the data elements, indicators, org units you need using the api starting here (from the snapshot demo in this example):

http://apps.dhis2.org/dev/api/resources

You can then retrieve the data like this :

http://apps.dhis2.org/dev/api/reportTables/data.xml?de=fbfJHSPpUQD&de=cYeuwXTCPkU&in=Uvn6LCg7dVU&ou=O6uvpzGd5pu&last12Months=true

You can substitute “.xml” with any of “.json”, “.jsonp”, “.csv”, “.pdf”, “.xls”, “.html”.

You can repeat the de, in and ou query params as many times you want to get data for multiple meta-data objects.

Valid params for periods are

reportingMonth

reportingBimonth

reportingQuarter

lastSixMonth

monthsThisYear

quartersThisYear

thisYear

monthsLastYear

quartersLastYear

lastYear

last5Years

last12Months

last6BiMonths

last4Quarters

last2SixMonths

thisFinancialYear

lastFinancialYear

last5FinancialYears

last52Weeks

Let me know it this helps. Lars


This message was scanned for viruses with Trend Micro ScanMail, GFI MailSecurity and GFI MailEssentials by the World Health Organization Regional Office for the Western Pacific. However, the recipient is advised to scan this e-mail and any attached files for viruses.

Disclaimer:

This e-mail, together with any attachments, is intended for the named recipients only and is confidential. It may also be privileged or otherwise protected by law.

If you have received it in error, please notify the sender immediately by reply e-mail and delete it and any attachments from your system. You may not copy or disclose its contents to anyone.

Wow, you've saved me like a week's worth of life!

I'm having proxy issues here at work, but I'll try it when I get home. I'll try and get it to work first, then take on the connection issues.

Thanks!
Tom

···

________________________________
From: Jason Pickering [mailto:jason.p.pickering@gmail.com]
Sent: 23 May 2012 12:48
To: Hiatt, Mr Tom (WPRO)
Cc: larshelge@gmail.com; dhis2-users@lists.launchpad.net
Subject: Re: [Dhis2-users] Connect to data

Hi Tom and Lars,

The following R code will retrieve the data as per Lars example, and place this into an R data frame. Maybe this will help to get you started.

Best Regards,
Jason

library(RCurl)
h = basicTextGatherer()
response<-getURL("DHIS2 App Hub
",userpwd="admin:district",httpauth = 1L,
header=TRUE,headerfunction=h$update)
#Ugly hack to get ride of the header
stripme<-paste("HTTP/1.1 200 OK\r\n",paste(h$value(NULL)[-1], collapse=""),sep="")
data<-gsub(stripme,"",response)
#Unquote the data
data<-noquote(data)
#here is the data.
foo<-read.table(textConnection(data),sep=",",header=T)

On Wed, May 23, 2012 at 5:17 AM, <hiattt@wpro.who.int<mailto:hiattt@wpro.who.int>> wrote:
Thanks, Lars. This is great and very useful! Now I just have to figure out how to have R pass those parameters to the URL and then return the CSV file into R.

Thanks again,
Tom

________________________________
From: Lars Helge Øverland [mailto:larshelge@gmail.com<mailto:larshelge@gmail.com>]
Sent: 23 May 2012 06:03

The proxy can be rather easily set by setting the curl options with
something like...

curl <- getCurlHandle()
curlSetOpt(.opts = list(proxy = '<address>:<port>'), curl = curl)
response<- getURL('http://mydhis',..., curl = curl)

Depends a bit on your proxy of course, so you may have to experiment a bit.

Regards,
Jason

···

On Wed, May 23, 2012 at 7:58 AM, <hiattt@wpro.who.int> wrote:

Wow, you’ve saved me like a week’s worth of life!

I’m having proxy issues here at work, but I’ll try it when I get home. I’ll try and get it to work first, then take on the connection issues.

Thanks!

Tom

________________________________

From: Jason Pickering [mailto:jason.p.pickering@gmail.com]
Sent: 23 May 2012 12:48

To: Hiatt, Mr Tom (WPRO)
Cc: larshelge@gmail.com; dhis2-users@lists.launchpad.net

Subject: Re: [Dhis2-users] Connect to data

Hi Tom and Lars,

The following R code will retrieve the data as per Lars example, and place this into an R data frame. Maybe this will help to get you started.

Best Regards,

Jason

library(RCurl)

h = basicTextGatherer()

response<-getURL("DHIS2 App Hub

",userpwd="admin:district",httpauth = 1L,

header=TRUE,headerfunction=h$update)

#Ugly hack to get ride of the header

stripme<-paste("HTTP/1.1 200 OK\r\n",paste(h$value(NULL)[-1], collapse=""),sep="")

data<-gsub(stripme,"",response)

#Unquote the data

data<-noquote(data)

#here is the data.

foo<-read.table(textConnection(data),sep=",",header=T)

On Wed, May 23, 2012 at 5:17 AM, <hiattt@wpro.who.int> wrote:

Thanks, Lars. This is great and very useful! Now I just have to figure out how to have R pass those parameters to the URL and then return the CSV file into R.

Thanks again,

Tom

________________________________

From: Lars Helge Øverland [mailto:larshelge@gmail.com]
Sent: 23 May 2012 06:03

To: Hiatt, Mr Tom (WPRO)

Cc: jason.p.pickering@gmail.com; dhis2-users@lists.launchpad.net

Subject: Re: [Dhis2-users] Connect to data

Hi Tom,

if you refer to aggregated data (from the aggregated* tables) there is a way to do this - unfortunately it is in trunk only and not documented yet - will try a quick explanation here:

To get the aggregated data you can interact with the reportTables resource. First you need to look up the ids (uids) of the data elements, indicators, org units you need using the api starting here (from the snapshot demo in this example):

DHIS2 App Hub

You can then retrieve the data like this :

DHIS2 App Hub

You can substitute ".xml" with any of ".json", ".jsonp", ".csv", ".pdf", ".xls", ".html".

You can repeat the de, in and ou query params as many times you want to get data for multiple meta-data objects.

Valid params for periods are

reportingMonth

reportingBimonth

reportingQuarter

lastSixMonth

monthsThisYear

quartersThisYear

thisYear

monthsLastYear

quartersLastYear

lastYear

last5Years

last12Months

last6BiMonths

last4Quarters

last2SixMonths

thisFinancialYear

lastFinancialYear

last5FinancialYears

last52Weeks

Let me know it this helps. Lars

________________________________

This message was scanned for viruses with Trend Micro ScanMail, GFI MailSecurity and GFI MailEssentials by the World Health Organization Regional Office for the Western Pacific. However, the recipient is advised to scan this e-mail and any attached files for viruses.

Disclaimer:

This e-mail, together with any attachments, is intended for the named recipients only and is confidential. It may also be privileged or otherwise protected by law.

If you have received it in error, please notify the sender immediately by reply e-mail and delete it and any attachments from your system. You may not copy or disclose its contents to anyone.

________________________________

This message was scanned for viruses with Trend Micro ScanMail, GFI MailSecurity and GFI MailEssentials by the World Health Organization Regional Office for the Western Pacific. However, the recipient is advised to scan this e-mail and any attached files for viruses.

Disclaimer:

This e-mail, together with any attachments, is intended for the named recipients only and is confidential. It may also be privileged or otherwise protected by law.

If you have received it in error, please notify the sender immediately by reply e-mail and delete it and any attachments from your system. You may not copy or disclose its contents to anyone.