Hi all,
I wanted to hear if anyone has any experience with the DHS API (http://api.dhsprogram.com/#/index.html), and with using it to import survey results into DHIS?
Olav
Not here unfortunately…just doing csv imports from DHS Excel files. Would be useful for our data warehouse.
Randy
Mailing list: https://launchpad.net/~dhis2-users
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-users
More help : https://help.launchpad.net/ListHelp
This message and its attachments are confidential and solely for the intended recipients. If received in error, please delete them and notify the sender via reply e-mail immediately.
Hi Olav & Randy,
I am currently banging on kettle (aka Pentaho DI) to extract data from a source system (an SQL ERP in our case) into DHIS2 dataSets in JSON format. In our current test scenario (2 dataElements in a dataSet with a categoryCombination of 5 categories) we are updating ca. 4 million dataValues every night in a pseudo-delta mode: reading all data from the source, comparing it to what is already in DHIS2, and then pushing only the records that create, update or delete dataValues into the api (ca. 150k per night in 1 hour; the initial load took 7 hours). We still have to prove that this is feasible when we set up the first real-life dataSet, where there will be more categories and more dataElements, thus exploding the number of dataValues.
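For readers not using kettle, the pseudo-delta comparison described above can be sketched in a few lines of Python. The (dataElement, period, orgUnit, categoryOptionCombo) key is an assumption about what uniquely identifies a dataValue, not Uwe's actual implementation:

```python
def split_delta(source, existing):
    """Compare source rows against what DHIS2 already holds and emit only
    the changes. Both arguments are dicts mapping a
    (dataElement, period, orgUnit, categoryOptionCombo) key to a value string.
    """
    # New keys in the source that DHIS2 does not have yet.
    creates = {k: v for k, v in source.items() if k not in existing}
    # Keys present on both sides but with a different value.
    updates = {k: v for k, v in source.items()
               if k in existing and existing[k] != v}
    # Keys DHIS2 has but the source no longer delivers.
    deletes = [k for k in existing if k not in source]
    return creates, updates, deletes
```

Only the three resulting sets would then be POSTed to the api, which is what keeps the nightly load at ~150k records instead of the full 4 million.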
Getting there was a bit painful, but now it seems to work. I chose kettle over Talend ETL (both open source) as it seemed easier to get used to. However, from a data warehouse perspective I'd prefer DHIS2 to offer some sort of integrated ETL landscape in the long run, which would also allow aggregating data from tracker into dataSets, tracker to tracker, dataSets to dataSets, etc.
The current version of our kettle transformations and jobs was designed to be generic (not for a specific dataSet, but you have to design your own extractor, which could be a simple csv-reader or maybe a DHS api-call). If you are interested, I will share them. Just be aware that they are currently in a very early and rough state and not documented. You'd have to bring along the willingness to dig into kettle and be pain-resistant to a certain degree :-)
I'd be interested to hear about other experiences ...
Have a nice Sunday,
Uwe
---
Hi Randy and Uwe,
thanks, interesting to hear your experiences. Uwe, what you are working on sounds quite a bit more complicated, and not least involves far more data. I imagine that with household surveys it would be a matter of < 100 indicators for < 200 orgunits for 2-3 periods, i.e. a fraction of what you are dealing with!
Olav
Dear Uwe,
Have you tried to send data via the endpoint api/dataValueSets? It may be faster. Just stage your data and push it once.
http://dhis2.github.io/dhis2-docs/master/en/developer/html/ch01s13.html#d5e1372
Also note how you send it: I have seen curl take ages when submitting individual values via the api. You need to send it as one file in one request, or implement concurrency.
Alex
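A minimal sketch of the single-bulk-request approach, using only the Python standard library. The server URL, credentials and UIDs in any real call would of course be your own:

```python
import base64
import json
import urllib.request

def build_data_value_set(data_set, period, org_unit, values):
    """Assemble one dataValueSets payload.
    values: list of (dataElement UID, value) pairs."""
    return {
        "dataSet": data_set,
        "period": period,
        "orgUnit": org_unit,
        "dataValues": [{"dataElement": de, "value": str(v)} for de, v in values],
    }

def push(base_url, user, password, payload):
    """POST the whole batch in a single request to api/dataValueSets."""
    req = urllib.request.Request(
        base_url + "/api/dataValueSets",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Basic auth header built by hand to stay in the standard library.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    return urllib.request.urlopen(req)
```

One payload containing thousands of dataValues costs a single HTTP round trip, which is the difference Alex observed versus value-by-value curl calls.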
–
Alex Tumwesigye
Technical Advisor - DHIS2 (Consultant),
Ministry of Health/AFENET
Kampala
Uganda
IT Consultant - BarefootPower Uganda Ltd, SmartSolar, Kenya
IT Specialist (Servers, Networks and Security, Health Information Systems - DHIS2 ) & Solar Consultant
+256 774149 775, + 256 759 800161
"I don’t want to be anything other than what I have been - one tree hill "
One of the interesting ideas from Uwe’s approach is that DHS has apparently standardized definitions for all indicators - presumably there is a code that we can use in DHIS-2 so that interoperability will be simplified. Uwe might want to extend the data element attributes to capture more of the metadata that is available in DHS to define the indicators. Also, I wonder if you plan to bring in the raw data (numerators & denominators) as data elements and build the calculations into DHIS-2, or bring in the calculated indicator values as data elements.
One of the challenges that we face in our Data Warehouse is that it contains indicators calculated based on both routine and population survey data. We have to be very careful of the indicator names so that people know which come from which source. For example: from DHS we have “Contraceptive prevalence rate - modern methods” while we estimate that from the routine HMIS data but call it “Contraceptive utilisation rate from health facilities - modern methods”.
Randy
–
Randy Wilson
*Team Leader:* Knowledge Management, Data Use and Research
Rwanda Health System Strengthening Activity
Management Sciences for Health
Rwanda-Kigali
Direct: +250 788308835
E-mail: rwilson@msh.org
Skype: wilsonrandy_us
Stronger health systems. Greater health impact.
Hi Olav,
I have not worked with the DHS API per se, but have imported lots of data using the same approach which they outline here (http://api.dhsprogram.com/#/samples-r.cfm)
I have written up a walkthrough of getting data out of one DHIS instance and into another one, and I think the basic principles would be the same (http://rpubs.com/jason_p_pickering/139589)
Metadata needs to be mapped (or created), and the data needs to be reshaped and correctly formatted.
It should not be too difficult. I used R, but there are other examples with Python and JavaScript on their examples page.
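A rough sketch of that pipeline in Python. The DHS endpoint path and response field names are from memory of the DHS API documentation and should be verified against api.dhsprogram.com; the two mapping dicts are the metadata mapping step Jason mentions, which you would have to build yourself:

```python
import json
import urllib.request

def fetch_dhs(indicator_ids, country_ids):
    """Pull survey records from the DHS API (field names per their docs)."""
    url = ("http://api.dhsprogram.com/rest/dhs/data"
           f"?indicatorIds={indicator_ids}&countryIds={country_ids}&f=json")
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["Data"]

def to_data_value(record, de_map, ou_map):
    """Reshape one DHS record into a DHIS2 dataValue.
    de_map: DHS IndicatorId -> DHIS2 dataElement UID (your own mapping);
    ou_map: DHS country/region code -> DHIS2 orgUnit UID (your own mapping)."""
    return {
        "dataElement": de_map[record["IndicatorId"]],
        "period": str(record["SurveyYear"]),
        "orgUnit": ou_map[record["DHS_CountryCode"]],
        "value": str(record["Value"]),
    }
```

The resulting list of dataValues can then be wrapped in a dataValueSets payload and POSTed to DHIS2 in one request.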
Regards,
Jason
Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+46764147049
Hi Alex,
thanks for the suggestions. That's actually the api I am using: per dataSet I post one request for deletion, one for creation and one for update, in parallel. Kettle has a transformation for converting tabular data into one JSON record, and another for POSTing that JSON chunk to the api in one request. I also saw your curl observation in the beginning, when there wasn't a DELETE option for the batch and I had to delete on a single-record basis.
Actually I was surprised that the performance of the api is quite acceptable: on our server it's roughly 375k records per hour for creating/updating/deleting (no network delays, since kettle runs on the same server as DHIS2 and thus POSTs to localhost). But I am thinking of breaking the load into parallel packages as you suggested, e.g. per dataElement, mainly to avoid memory dumps from kettle - the JSON converter is quite hungry. Is DHIS2 able to detect memory shortages from parallel api imports without dumping?
Does anyone have experience with more performant options, like posting CSV to dataValueSets or using the new ADX api? Actually I'd prefer DHIS2 to offer an api where I can POST a CSV-like structure per dataSet, like [ou,pe,Category1,Category2,etc,DataElement1,DataElement2,etc]. I suppose this would reduce the volume of data to be transferred significantly; I'm not sure about the performance.
Regards,
Uwe
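To illustrate the wide CSV layout Uwe proposes, here is a hypothetical sketch of how such a structure could be unpivoted into individual dataValues client-side. The `combo_lookup` callable is a stand-in for resolving category option names to a categoryOptionCombo UID, which a real import would do via the DHIS2 metadata api:

```python
import csv
import io

def unpivot(csv_text, category_cols, combo_lookup):
    """Turn wide rows [ou, pe, Category..., DataElement...] into dataValues.
    combo_lookup: maps a tuple of category option names to a
    categoryOptionCombo UID (hypothetical; supply your own resolver)."""
    rows = csv.DictReader(io.StringIO(csv_text))
    out = []
    for row in rows:
        combo = combo_lookup(tuple(row[c] for c in category_cols))
        for col, value in row.items():
            # Every remaining column is a dataElement carrying one value.
            if col in ("ou", "pe") or col in category_cols:
                continue
            out.append({"dataElement": col, "orgUnit": row["ou"],
                        "period": row["pe"],
                        "categoryOptionCombo": combo, "value": value})
    return out
```

One wide row expands into as many dataValues as there are dataElement columns, which is why the wide transfer format would be so much smaller on the wire.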
---
Hi Jason,
thanks for sharing the links. As far as I can see at a quick glance, you are also experimenting with the ADX api - did you observe any significant performance differences between the ADX and dataValueSets apis?
Regards,
Uwe
Jason Pickering <jason.p.pickering@gmail.com> hat am 2. Februar 2016 um 18:21
geschrieben:Hi Olav,
I have not worked with the DHS API per se, but have imported lots of data
using the same approach which they outline here (
The DHS Program API)I have written up a walkthrough of getting data out of one DHIS instance
and into another one, and I think the basic principles would be the same (
http://rpubs.com/jason_p_pickering/139589\)Metadata needs to be mapped (or created), the data needs to be reshaped,
and correctly formatted.It should not be too difficult. I used R, but there are other examples with
Python and JavaScript on their examples page.Regards,
JasonOn Tue, Feb 2, 2016 at 3:31 PM, Alex Tumwesigye <atumwesigye@gmail.com> > wrote:
> Dear Uwe,
>
> Have you tried to send data via the endpoint api/dataValueSets, it may be
> faster. Just stage your data and push it once.
>
> http://dhis2.github.io/dhis2-docs/master/en/developer/html/ch01s13.html#d5e1372
>
> Also to note, is how you send it, I have seen curl taking ages to submit
> individual values via the api. You need to send it as once file via once
> request or implement concurrency.
>
> Alex
>
> On Tue, Feb 2, 2016 at 5:13 PM, Olav Poppe <olav.poppe@me.com> wrote:
>
>> Hi Randy and Uwe,
>> thanks, interesting to hear you experiences. Uwe, what you are working on
>> sounds quite a bit more complicated, and not least with far more data. I
>> image that with household surveys, it would be a matter of < 100 indicators
>> for < 200 orgunits for 2-3 periods, i.e. a fraction of what you are dealing
>> with!
>>
>> Olav
>>
>>
>>
>>
>>
>>
>> 31. jan. 2016 kl. 09.29 skrev uwe wahser <uwe@wahser.de>:
>>
>> Hi Olav & Randy,
>>
>> I am currently banging on kettle (aka Pentaho DI) to extract data from a
>> source-system (SQL-ERP in our case) into DHIS2 dataSets in json format. In
>> our current test-scenario (2 dataElements in a dataSet with a
>> categoryCombination of 5 categories) we are currently updating ca. 4 mio
>> dataValues every night in a pseudo-delta mode (reading all data from
>> source, comparing to what is there in DHIS2 already, then only pushing
>> records for creating, updating or deleting dataValues into the api: ca.
>> 150k per night in 1 hour, initial load was 7hrs). We still have to prove,
>> that this is feasible when setting up the first real life dataSet where
>> there will be more categories and more dataElements, thus exploding the
>> number of dataValues.
>>
>> Getting there was a bit painful, but now it seems to work. I chose kettle
>> instead of Talend ETL (both open source) as it seemed to be easier to get
>> used to. However, from a data warehouse perspective I'd prefer to have
>> DHIS2 offering some sort of an integrated ETL landscape on the long run,
>> which would also allow to aggregate data from tracker into dataSets,
>> tracker to tracker, dataSets to dataSets etc.
>>
>> Our current version of the kettle transformations and jobs were designed
>> to be generic (not for a specific dataSet, but you have to design your own
>> extractor which could be a simple csv-reader or maybe a DHS api-call). If
>> you are interested, I will share them. Just be aware that they are
>> currently in a very early and rough state and not documented. You'd have to
>> bring along the willingness to dig yourself into kettle and be pain
>> resistant to a certain degree
>>
>> I'd be interested to hear from other experiences ...
>>
>> Have a nice sunday,
>>
>> Uwe
--
Alex Tumwesigye

Technical Advisor - DHIS2 (Consultant),
Ministry of Health/AFENET
Kampala
Uganda

IT Consultant - BarefootPower Uganda Ltd, SmartSolar, Kenya

IT Specialist (Servers, Networks and Security, Health Information Systems - DHIS2) & Solar Consultant

+256 774149 775, +256 759 800161

"I don't want to be anything other than what I have been - one tree hill"
--
Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+46764147049
This was a very trivial lab test, so not really conclusive at all. I would just give it a try and see. If you see differences, please let the devs know.
Given the scale of what you are attempting, have you considered using direct SQL injection? Not that I am recommending that route as there are many pitfalls, but it might be an option if implemented properly, especially considering your reported architecture.
Regards
Jason
On Tue, Feb 2, 2016, 17:04 Uwe Wahser uwe@wahser.de wrote:
Hi Jason,
thanks for sharing the links. From a quick glance, I can see you are also
experimenting with the ADX API - did you observe any significant performance
differences between the ADX and dataValueSets APIs?
Regards,
Uwe
Jason Pickering jason.p.pickering@gmail.com wrote on 2 February 2016 at 18:21:
Hi Olav,
I have not worked with the DHS API per se, but have imported lots of data
using the same approach which they outline there (The DHS Program API,
http://api.dhsprogram.com/#/index.html).
I have written up a walkthrough of getting data out of one DHIS instance
and into another one, and I think the basic principles would be the same
(http://rpubs.com/jason_p_pickering/139589).
Metadata needs to be mapped (or created), and the data needs to be reshaped
and correctly formatted.
It should not be too difficult. I used R, but there are other examples with
Python and JavaScript on their examples page.
Regards,
Jason
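The mapping-and-reshaping step Jason describes can be sketched in a few lines of Python. The DHS-side field names (IndicatorId, RegionId, SurveyYear, Value) and the UID mappings below are illustrative assumptions, not the exact DHS API schema:

```python
# Sketch: reshape survey rows (as a DHS-API-style list of dicts) into a
# DHIS2 dataValueSet payload. Unmapped indicators or regions are skipped;
# in practice you would log them for metadata review.

def to_data_value_set(dhs_rows, indicator_map, orgunit_map):
    """dhs_rows: list of dicts with IndicatorId, RegionId, SurveyYear, Value.
    indicator_map / orgunit_map: DHS code -> DHIS2 UID."""
    values = []
    for row in dhs_rows:
        de = indicator_map.get(row["IndicatorId"])
        ou = orgunit_map.get(row["RegionId"])
        if de is None or ou is None:
            continue  # unmapped metadata: skip this row
        values.append({
            "dataElement": de,
            "orgUnit": ou,
            "period": str(row["SurveyYear"]),  # yearly period, e.g. "2015"
            "value": str(row["Value"]),
        })
    return {"dataValues": values}
```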
On Tue, Feb 2, 2016 at 3:31 PM, Alex Tumwesigye atumwesigye@gmail.com wrote:
Dear Uwe,
Have you tried to send data via the endpoint api/dataValueSets? It may be
faster. Just stage your data and push it once.
http://dhis2.github.io/dhis2-docs/master/en/developer/html/ch01s13.html#d5e1372
Also to note is how you send it: I have seen curl taking ages to submit
individual values via the api. You need to send it as one file via one
request, or implement concurrency.
Alex
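For reference, a single request to api/dataValueSets carries one payload with many values. A sketch of the JSON shape follows; all UIDs are placeholders, and the exact set of accepted fields can vary slightly between DHIS2 versions:

```json
{
  "dataSet": "xxxxDataSet1",
  "period": "201601",
  "orgUnit": "xxxxOrgUnit1",
  "dataValues": [
    {"dataElement": "xxxxDataEl01", "categoryOptionCombo": "xxxxCocUid1", "value": "12"},
    {"dataElement": "xxxxDataEl02", "categoryOptionCombo": "xxxxCocUid1", "value": "14"}
  ]
}
```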
On Tue, Feb 2, 2016 at 5:13 PM, Olav Poppe olav.poppe@me.com wrote:
Hi Randy and Uwe,
thanks, it is interesting to hear your experiences. Uwe, what you are working on
sounds quite a bit more complicated, and not least with far more data. I
imagine that with household surveys, it would be a matter of < 100 indicators
for < 200 orgunits for 2-3 periods, i.e. a fraction of what you are dealing
with!
Olav
Hi Uwe,
ADX will not be faster than DXF, as for ADX, the stream is first converted into DXF and then passed on to the regular importer.
Lars
–
Lars Helge Øverland
Lead developer, DHIS 2
University of Oslo
Skype: larshelgeoverland
Hi Randy,
currently I am just loading the bare dataSet. But you are right: a normal nightly load run should first update the metadata and then update the dataValues, otherwise you'd have values being rejected if they were coded to new orgUnits or category options that are not yet in DHIS2. We are not there yet, but that would be one of the next activities.
Also, as you have stated, the current version expects the categoryOptions to be compliant with those in DHIS2. Mappings have to be done in the custom extractor. As you state, it is easier if there is no mapping needed, but from my previous DWH experience I know that mapping is normally desired, since analysis data can usually be grouped into broader categories than those from the operational systems, thus reducing the number of combinations in the cubes.
Our main benefit for the moment is that the ETL process compares the dataValues to what is already present in DHIS2 and then decides whether to update an existing value, create a new value, or delete a value that no longer comes from the source (data are extracted in full, but uploaded as a pseudo-delta). The transformation from tabular data to the DHIS2 API format is also done, including the mapping to DHIS2 IDs for category option combinations.
Regards,
Uwe
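The category-option-combination mapping Uwe mentions can be done with a set-based lookup. This is a hedged sketch: the metadata shape assumed here mirrors what a query like api/categoryOptionCombos.json?fields=id,categoryOptions[name] would return, but verify the exact response against your DHIS2 version:

```python
# Sketch: resolve an unordered set of categoryOption names to the DHIS2
# categoryOptionCombo UID. Using frozenset makes the lookup independent
# of the order in which the source system lists the options.

def build_coc_lookup(coc_metadata):
    """Map frozenset of option names -> categoryOptionCombo UID."""
    lookup = {}
    for coc in coc_metadata["categoryOptionCombos"]:
        key = frozenset(opt["name"] for opt in coc["categoryOptions"])
        lookup[key] = coc["id"]
    return lookup

def resolve_coc(lookup, option_names):
    """Order-insensitive lookup; returns None when no combo matches."""
    return lookup.get(frozenset(option_names))
```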
---
On 02.02.2016 at 18:04, Wilson, Randy wrote:
One of the interesting ideas from Uwe's approach is that DHS has apparently standardized definitions for all indicators - presumably there is a code that we can use in DHIS-2 so that interoperability will be simplified. Uwe might want to extend the data element attributes to capture more of the metadata that is available in DHS to define the indicators. Also, I wonder if you plan to bring in the raw data (numerators & denominators) as data elements and build the calculations into DHIS-2, or bring in the calculated indicator values as data elements.
One of the challenges that we face in our Data Warehouse is that it contains indicators calculated based on both routine and population survey data. We have to be very careful of the indicator names so that people know which come from which source. For example: from DHS we have "Contraceptive prevalence rate - modern methods" while we estimate that from the routine HMIS data but call it "Contraceptive utilisation rate from health facilities - modern methods".
Randy
Randy Wilson
Team Leader: Knowledge Management, Data Use and Research
Rwanda Health System Strengthening Activity
Management Sciences for Health
Rwanda-Kigali
Direct: +250 788308835
E-mail: rwilson@msh.org
Skype: wilsonrandy_us
www.msh.org
Stronger health systems. Greater health impact.
OK, that sounds like ADX might even end up a bit slower, if the transformation process outweighs a potentially reduced data volume. I might just stick with the JSON.
@Jason: I also thought briefly about direct SQL, but I fear internal changes of the data model, which I'd have to understand fully in the first place. Of course the APIs also change more than I expected, but at least that is announced.
Uwe
Lars is right that ADX won't be faster than DXF, both because it
internally converts to DXF on import and because it abstracts away the
categoryOptionCombo. The first isn't really very costly, but the other
is.
This means that the two systems only have to match categories and
categoryOptions, which is a much easier mapping to maintain.
But if you need raw speed, it is going to be faster to produce DXF-style
categoryOptionCombos, as that is closest to the way the data gets
stored. I am going to speed up the ADX import code, but it will still
always be slower.
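The extra work the ADX importer does can be illustrated with a small sketch: an ADX-style record carries category dimensions as separate attributes, which must be resolved to a single categoryOptionCombo UID (the DXF representation) before storage. Field names and UIDs below are illustrative, not the actual importer code:

```python
# Sketch of the resolution step that makes ADX import costlier than DXF:
# collapse per-category attributes into one categoryOptionCombo UID.

def adx_to_dxf(adx_value, coc_lookup):
    """adx_value: dict with dataElement, orgUnit, period, value, plus
    arbitrary category-dimension attributes (e.g. sex, age).
    coc_lookup: frozenset of option names -> categoryOptionCombo UID."""
    dims = {k: v for k, v in adx_value.items()
            if k not in ("dataElement", "orgUnit", "period", "value")}
    coc = coc_lookup[frozenset(dims.values())]  # the extra resolution step
    return {
        "dataElement": adx_value["dataElement"],
        "orgUnit": adx_value["orgUnit"],
        "period": adx_value["period"],
        "categoryOptionCombo": coc,
        "value": adx_value["value"],
    }
```

Producing DXF-style categoryOptionCombos on the client side skips this lookup entirely, which is the raw-speed option mentioned above.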
On 2 February 2016 at 20:07, uwe wahser <uwe@wahser.de> wrote:
Ok, that sounds like ADX might even be a bit slower eventually, if the
transformation process outweighs a potentially reduced datavolume. I might
just stick with the json.@Jason: I also thought about SQL-Injection shortly, but I am fearing
internal changes of the data-model, which I'd have to understand fully in
the first place. Of course the api's also change more than I expected, but
at least that is announcedUwe
---
Am 02.02.2016 um 19:49 schrieb Lars Helge Øverland:
Hi Uwe,
ADX will not be faster than DXF, as for ADX, the stream is first converted
into DXF and then passed on to the regular importer.Lars
On Tue, Feb 2, 2016 at 5:33 PM, Jason Pickering > <jason.p.pickering@gmail.com> wrote:
This was a very trivial lab test,so not really conclusive at all. I would
just give it a try and see. If you see differences, please let the devs
know.Given the scale of what you are attempting, have you considered using
direct SQL injection? Not that I am recommending that route as there are
many pitfalls, but it might be an option if implemented properly, especially
considering your reported architecture.Regards
JasonOn Tue, Feb 2, 2016, 17:04 Uwe Wahser <uwe@wahser.de> wrote:
Hi Jason,
thanks for sharing the links. As I can see on a quick glance, you are
also
experimenting with the ADX-api - did you observe any significant
performance
differences between ADX and dataValueSets apis?Regards,
Uwe
> Jason Pickering <jason.p.pickering@gmail.com> hat am 2. Februar 2016 um
> 18:21
> geschrieben:
>
>
> Hi Olav,
> I have not worked with the DHS API per se, but have imported lots of
> data
> using the same approach which they outline here (
> The DHS Program API)
>
> I have written up a walkthrough of getting data out of one DHIS
> instance
> and into another one, and I think the basic principles would be the
> same (
> http://rpubs.com/jason_p_pickering/139589\)
>
> Metadata needs to be mapped (or created), and the data needs to be
> reshaped and correctly formatted.
>
> It should not be too difficult. I used R, but there are other examples
> with
> Python and JavaScript on their examples page.
>
> Regards,
> Jason
>
>
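The mapping and reshaping step Jason describes can be sketched roughly as below. This is not the walkthrough code itself; the source row shape, indicator codes and DHIS2 UIDs are made-up placeholders for illustration.

```python
# Sketch of mapping source metadata to DHIS2 UIDs and reshaping rows into a
# dataValueSets payload. All codes and UIDs below are hypothetical examples.

# Hypothetical mappings from source indicator/region codes to DHIS2 UIDs.
INDICATOR_MAP = {"CM_ECMR_C_IMR": "deUid0001"}
ORGUNIT_MAP = {"RW": "ouUid0001"}

def to_data_value_set(rows):
    """Reshape source rows into a DHIS2 dataValueSets payload (a dict)."""
    values = []
    for row in rows:
        values.append({
            "dataElement": INDICATOR_MAP[row["indicator"]],
            "orgUnit": ORGUNIT_MAP[row["region"]],
            "period": row["year"],
            "value": str(row["value"]),
        })
    return {"dataValues": values}

payload = to_data_value_set(
    [{"indicator": "CM_ECMR_C_IMR", "region": "RW", "year": "2015", "value": 32}]
)
```

The same structure works whether the rows come from a survey API, a CSV export, or another DHIS instance; only the extractor and the two mapping tables change.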
> On Tue, Feb 2, 2016 at 3:31 PM, Alex Tumwesigye <atumwesigye@gmail.com> wrote:
>
> > Dear Uwe,
> >
> > Have you tried to send data via the endpoint api/dataValueSets, it
> > may be
> > faster. Just stage your data and push it once.
> >
> >
> > http://dhis2.github.io/dhis2-docs/master/en/developer/html/ch01s13.html#d5e1372
> >
> > Also note how you send it: I have seen curl taking ages to submit
> > individual values via the API. You need to send it all as one file via
> > one request, or implement concurrency.
> >
> > Alex
> >
> > On Tue, Feb 2, 2016 at 5:13 PM, Olav Poppe <olav.poppe@me.com> wrote:
> >
> >> Hi Randy and Uwe,
> >> thanks, interesting to hear your experiences. Uwe, what you are working
> >> on sounds quite a bit more complicated, and not least with far more
> >> data. I imagine that with household surveys, it would be a matter of
> >> < 100 indicators for < 200 orgunits for 2-3 periods, i.e. a fraction of
> >> what you are dealing with!
> >>
> >> Olav
> >>
> >>
> >>
> >>
> >>
> >>
> >> On 31 Jan 2016 at 09:29, uwe wahser <uwe@wahser.de> wrote:
> >>
> >> Hi Olav & Randy,
> >>
> >> I am currently banging on kettle (aka Pentaho DI) to extract data from a
> >> source system (an SQL ERP in our case) into DHIS2 dataSets in JSON
> >> format. In our current test scenario (2 dataElements in a dataSet with a
> >> categoryCombination of 5 categories) we are currently updating ca. 4
> >> million dataValues every night in a pseudo-delta mode: reading all data
> >> from source, comparing it to what is already in DHIS2, then pushing only
> >> the records that create, update or delete dataValues into the API - ca.
> >> 150k per night in 1 hour; the initial load was 7 hrs. We still have to
> >> prove that this is feasible for the first real-life dataSet, where more
> >> categories and more dataElements will explode the number of dataValues.
> >>
> >> Getting there was a bit painful, but now it seems to work. I chose
> >> kettle over Talend ETL (both open source) as it seemed easier to get
> >> used to. However, from a data warehouse perspective I'd prefer DHIS2 to
> >> offer some sort of integrated ETL landscape in the long run, which would
> >> also allow aggregating data from tracker into dataSets, tracker to
> >> tracker, dataSets to dataSets, etc.
> >>
> >> Our current version of the kettle transformations and jobs was designed
> >> to be generic (not for a specific dataSet, but you have to design your
> >> own extractor, which could be a simple CSV reader or maybe a DHS API
> >> call). If you are interested, I will share them. Just be aware that they
> >> are currently in a very early and rough state and not documented. You'd
> >> have to bring along the willingness to dig into kettle, and be
> >> pain-resistant to a certain degree.
> >>
> >> I'd be interested to hear about other experiences ...
> >>
> >> Have a nice Sunday,
> >>
> >> Uwe
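The pseudo-delta mode Uwe describes boils down to a three-way set comparison between the source extract and what DHIS2 already holds. A minimal sketch, with simplified key and value types as assumptions:

```python
# Sketch of a pseudo-delta comparison: given all source values and the values
# currently in DHIS2, emit only the creates, updates and deletes to push.
# Keys are simplified to (dataElement, orgUnit, period, categoryOptionCombo).

def delta(source, current):
    """Both arguments map a value key tuple -> value string."""
    creates = {k: v for k, v in source.items() if k not in current}
    updates = {k: v for k, v in source.items()
               if k in current and current[k] != v}
    deletes = [k for k in current if k not in source]
    return creates, updates, deletes

creates, updates, deletes = delta(
    source={("de1", "ou1", "2015", "coc1"): "10",
            ("de2", "ou1", "2015", "coc1"): "5"},
    current={("de2", "ou1", "2015", "coc1"): "4",
             ("de3", "ou1", "2015", "coc1"): "7"},
)
```

In kettle this comparison is done with stream lookups rather than in-memory dicts, but the logic is the same: only the three result sets ever hit the API, which is why 4 million values reduce to ca. 150k nightly writes.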
> >>
> >> ---
> >>
> >> On 29.01.2016 at 17:31, Wilson, Randy wrote:
> >>
> >> Not here unfortunately...just doing csv imports from DHS Excel
> >> files.
> >> Would be useful for our data warehouse.
> >> Randy
> >> On Jan 29, 2016 2:59 PM, "Olav Poppe" <olav.poppe@me.com> wrote:
> >>
> >>> Hi all,
> >>> I wanted to hear if anyone has any experience with the DHS API
> >>> (http://api.dhsprogram.com/#/index.html), and using it to import survey
> >>> results into DHIS?
> >>>
> >>> Olav
> >>>
> >>> _______________________________________________
> >>> Mailing list: https://launchpad.net/~dhis2-users
> >>> Post to : dhis2-users@lists.launchpad.net
> >>> Unsubscribe : https://launchpad.net/~dhis2-users
> >>> More help : https://help.launchpad.net/ListHelp
> >>>
> >>>
> >>
> >>
> >>
> >
> >
> > --
> > Alex Tumwesigye
> >
> > Technical Advisor - DHIS2 (Consultant),
> > Ministry of Health/AFENET
> > Kampala
> > Uganda
> >
> > IT Consultant - BarefootPower Uganda Ltd, SmartSolar, Kenya
> >
> > IT Specialist (Servers, Networks and Security, Health Information
> > Systems
> > - DHIS2 ) & Solar Consultant
> >
> > +256 774149 775, + 256 759 800161
> >
> > "I don't want to be anything other than what I have been - one tree
> > hill "
> >
> >
> >
>
>
> --
> Jason P. Pickering
> email: jason.p.pickering@gmail.com
> tel:+46764147049
--
Lars Helge Øverland
Lead developer, DHIS 2
University of Oslo
Skype: larshelgeoverland
http://www.dhis2.org
Hi Uwe,
make sure that you have tuned Postgres properly through postgresql.conf, especially the last 5 settings are crucial for getting good write performance.
http://dhis2.github.io/dhis2-docs/master/en/implementer/html/ch08s03.html#d5e464
checkpoint_segments = 32
PostgreSQL writes new transactions to a log file called WAL segments which are 16MB in size. When a number of segments have been written a checkpoint occurs. Setting this number to a larger value will thus improve performance for write-heavy systems such as DHIS 2.
checkpoint_completion_target = 0.8
Determines the percentage of segment completion before a checkpoint occurs. Setting this to a high value will thus spread the writes out and lower the average write overhead.
wal_buffers = 16MB
Sets the memory used for buffering during the WAL write process. Increasing this value might improve throughput in write-heavy systems.
synchronous_commit = off
Specifies whether transaction commits will wait for WAL records to be written to disk before returning to the client. Setting this to off will improve performance considerably. It also implies a slight delay between the transaction being reported as successful to the client and it actually being safe on disk, but the database state cannot be corrupted, so this is a good trade-off for performance-intensive and write-heavy systems like DHIS 2.
wal_writer_delay = 10000ms
Specifies the delay between WAL write operations. Setting this to a high value will improve performance on write-heavy systems since potentially many write operations can be executed within a single flush to disk.
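Collected into a `postgresql.conf` excerpt, the five settings above look as follows (values as quoted from the implementer guide; tune them to your own hardware and durability requirements):

```ini
# Write-heavy tuning for DHIS 2 (values from the DHIS2 implementer guide)
checkpoint_segments = 32
checkpoint_completion_target = 0.8
wal_buffers = 16MB
synchronous_commit = off
wal_writer_delay = 10000ms
```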
On Tue, Feb 2, 2016 at 9:18 PM, Bob Jolliffe bobjolliffe@gmail.com wrote:
Lars is right that ADX won't be faster than DXF, both because it
internally converts to DXF on import and because it abstracts away the
categoryOptionCombo. The first isn't really very costly, but the other is.

On the other hand, abstracting away the combos means that the two systems
only have to match categories and categoryOptions, which is a much easier
mapping to maintain.

But if you need raw speed, it is going to be faster to produce DXF-style
categoryOptionCombos, as that is closest to the way the data gets stored.
I am going to speed up the ADX import code, but it will still always be
slower.
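Bob's trade-off can be illustrated by comparing the two value shapes. The UIDs, codes and category names below are made-up examples, not real identifiers:

```python
# Illustration of the DXF vs ADX trade-off with hypothetical identifiers:
# a DXF-style value must carry the resolved categoryOptionCombo UID, while
# an ADX-style value carries option codes and lets the importer resolve them.

# Lookup the exporting system must maintain for DXF:
# (sex, age) option codes -> categoryOptionCombo UID in the target DHIS2.
COC_LOOKUP = {("F", "0-4"): "cocUid0001"}

def dxf_value(de_uid, ou_uid, period, sex, age, value):
    # DXF: resolve the combo up front - closest to storage, fastest to import.
    return {"dataElement": de_uid, "orgUnit": ou_uid, "period": period,
            "categoryOptionCombo": COC_LOOKUP[(sex, age)], "value": value}

def adx_value(de_code, ou_code, period, sex, age, value):
    # ADX: ship category option codes; DHIS2 maps them to the combo on import.
    return {"dataElement": de_code, "orgUnit": ou_code, "period": period,
            "sex": sex, "age": age, "value": value}

dxf = dxf_value("deUid0001", "ouUid0001", "201501", "F", "0-4", "10")
adx = adx_value("DE_IMR", "OU_EXAMPLE", "201501", "F", "0-4", "10")
```

The `COC_LOOKUP` table is exactly the extra mapping burden DXF puts on the exporting system; ADX trades it for the conversion cost on import.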
–
Lars Helge Øverland
Lead developer, DHIS 2
University of Oslo
Skype: larshelgeoverland
Hi Lars,
thanks for the hint. Currently I am just running on the defaults. I'll do a
bit of monitoring first before adjusting those values. I'll post the difference.
Regards,
Uwe
---
Sure. I usually get 2,000-5,000 values per second on import, so there should be room for some improvement.
regards,
Lars
On Wed, Feb 3, 2016 at 8:52 AM, Uwe Wahser uwe@wahser.de wrote:
Hi Lars,
thanks for the hint. Currently I am just running on standards. I’ll do a bit of
monitoring first before adjusting those values. I’ll post the difference.
Regards,
Uwe
Lars Helge Øverland larshelge@gmail.com hat am 2. Februar 2016 um 23:43
geschrieben:
Hi Uwe,
make sure that you have tuned Postgres properly through postgresql.conf,
especially the last 5 settings are crucial for getting good write
performance.
http://dhis2.github.io/dhis2-docs/master/en/implementer/html/ch08s03.html#d5e464
checkpoint_segments = 32
PostgreSQL writes new transactions to a log file called WAL segments which
are 16MB in size. When a number of segments have been written a checkpoint
occurs. Setting this number to a larger value will thus improve performance
for write-heavy systems such as DHIS 2.
checkpoint_completion_target = 0.8
Determines the percentage of segment completion before a checkpoint occurs.
Setting this to a high value will thus spread the writes out and lower the
average write overhead.
wal_buffers = 16MB
Sets the memory used for buffering during the WAL write process. Increasing
this value might improve throughput in write-heavy systems.
synchronous_commit = off
Specifies whether transaction commits will wait for WAL records to be
written to the disk before returning to the client or not. Setting this to
off will improve performance considerably. It also implies that there is a
slight delay between the transaction is reported successful to the client
and it actually being safe, but the database state cannot be corrupted and
this is a good alternative for performance-intensive and write-heavy
systems like DHIS 2.
wal_writer_delay = 10000ms
Specifies the delay between WAL write operations. Setting this to a high
value will improve performance on write-heavy systems since potentially
many write operations can be executed within a single flush to disk.
On Tue, Feb 2, 2016 at 9:18 PM, Bob Jolliffe bobjolliffe@gmail.com wrote:
Lars is right that ADX won’t be faster than dxf. Both because it
internally converts to dxf on import and because it abstracts away the
categoryoptioncombo. The first isn’t really very costly but the other
is.
This means that that the two systems only have to match categories and
categoryoptions which is a much easier mapping to maintain.
But if you need raw speed it is going to be faster to produce dxf
style categoryoptioncombos as that is closest to the way the data gets
stored. I am going to speed up the adx import code, but will still
always be slower
On 2 February 2016 at 20:07, uwe wahser uwe@wahser.de wrote:
Ok, that sounds like ADX might even be a bit slower eventually, if the
transformation process outweighs a potentially reduced datavolume. I
might
just stick with the json.
@Jason: I also thought about SQL-Injection shortly, but I am fearing
internal changes of the data-model, which I’d have to understand fully in
the first place. Of course the api’s also change more than I expected,
but
at least that is announced
Uwe
Am 02.02.2016 um 19:49 schrieb Lars Helge Øverland:
Hi Uwe,
ADX will not be faster than DXF, as for ADX, the stream is first
converted
into DXF and then passed on to the regular importer.
Lars
On Tue, Feb 2, 2016 at 5:33 PM, Jason Pickering
jason.p.pickering@gmail.com wrote:
This was a very trivial lab test, so not really conclusive at all. I
would just give it a try and see. If you see differences, please let the
devs know.
Given the scale of what you are attempting, have you considered using
direct SQL injection? Not that I am recommending that route, as there are
many pitfalls, but it might be an option if implemented properly,
especially considering your reported architecture.
Regards
Jason
On Tue, Feb 2, 2016, 17:04 Uwe Wahser uwe@wahser.de wrote:
Hi Jason,
thanks for sharing the links. As I can see at a quick glance, you are
also experimenting with the ADX API - did you observe any significant
performance differences between the ADX and dataValueSets APIs?
Regards,
Uwe
Jason Pickering jason.p.pickering@gmail.com wrote on 2 February 2016 at 18:21:
Hi Olav,
I have not worked with the DHS API per se, but have imported lots of data
using the same approach which they outline here
(http://api.dhsprogram.com/#/samples-r.cfm).
I have written up a walkthrough of getting data out of one DHIS instance
and into another one, and I think the basic principles would be the same
(http://rpubs.com/jason_p_pickering/139589).
Metadata needs to be mapped (or created), and the data needs to be
reshaped and correctly formatted.
It should not be too difficult. I used R, but there are other examples
with Python and JavaScript on their examples page.
Regards,
Jason
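As a companion to the R samples Jason mentions, here is a minimal Python sketch of building a query against the DHS data API (endpoint and parameter names follow the DHS API documentation; the country and indicator IDs are illustrative, not taken from the thread):

```python
from urllib.parse import urlencode

# Base endpoint of the public DHS indicator-data API
BASE = "http://api.dhsprogram.com/rest/dhs/data"

def dhs_data_url(country_id, survey_year, indicator_ids, per_page=1000):
    """Build a query URL for one country, survey year and indicator list."""
    params = {
        "countryIds": country_id,
        "surveyYear": survey_year,
        "indicatorIds": ",".join(indicator_ids),
        "perPage": per_page,
        "f": "json",
    }
    return BASE + "?" + urlencode(params)

url = dhs_data_url("RW", 2010, ["FE_FRTR_W_TFR"])
print(url)
# Fetching and parsing would then be, e.g.:
#   import json, urllib.request
#   rows = json.load(urllib.request.urlopen(url))["Data"]
```

The rows returned would still need the metadata mapping and reshaping Jason describes before they could go into DHIS 2.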
On Tue, Feb 2, 2016 at 3:31 PM, Alex Tumwesigye atumwesigye@gmail.com wrote:
Dear Uwe,
Have you tried to send data via the endpoint api/dataValueSets? It may be
faster. Just stage your data and push it in one go.
http://dhis2.github.io/dhis2-docs/master/en/developer/html/ch01s13.html#d5e1372
Also note how you send it: I have seen curl taking ages to submit
individual values via the API. You need to send everything as one file
via one request, or implement concurrency.
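A sketch of what "stage and push once" can look like against the dataValueSets endpoint (the UIDs below are placeholders, and the curl line assumes a generic DHIS 2 server):

```python
import json

def build_datavalueset(dataset, period, orgunit, values):
    """Stage a whole batch of values as one dataValueSet payload.

    values: dict mapping dataElement UID -> value.
    """
    return {
        "dataSet": dataset,
        "period": period,
        "orgUnit": orgunit,
        "dataValues": [
            {"dataElement": de, "value": str(v)} for de, v in values.items()
        ],
    }

# Placeholder UIDs, not taken from the thread:
payload = build_datavalueset("pBOMPrpg1QX", "201601", "DiszpKrYNg8",
                             {"f7n9E0hX8qk": 12, "Ix2HsbDMLea": 34})
body = json.dumps(payload)
# One POST for the whole batch rather than one request per value, e.g.:
#   curl -X POST -H "Content-Type: application/json" -u user:pass \
#        -d @payload.json https://<server>/api/dataValueSets
print(len(payload["dataValues"]))
```

The point is Alex's: the request overhead is paid once per batch, not once per value.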
Alex
On Tue, Feb 2, 2016 at 5:13 PM, Olav Poppe olav.poppe@me.com
wrote:
Hi Randy and Uwe,
thanks, interesting to hear your experiences. Uwe, what you are working
on sounds quite a bit more complicated, and not least with far more data.
I imagine that with household surveys, it would be a matter of < 100
indicators for < 200 orgunits for 2-3 periods, i.e. a fraction of what
you are dealing with!
Olav
On - Jan 2016 at 09:29, uwe wahser uwe@wahser.de wrote:
Hi Olav & Randy,
I am currently banging on Kettle (aka Pentaho DI) to extract data from a
source system (an SQL ERP in our case) into DHIS2 dataSets in JSON
format. In our current test scenario (2 dataElements in a dataSet with a
categoryCombination of 5 categories) we are currently updating ca. 4
million dataValues every night in a pseudo-delta mode (reading all data
from the source, comparing it to what is already in DHIS2, then pushing
only records for creating, updating or deleting dataValues into the API:
ca. 150k per night in 1 hour; the initial load took 7 hrs). We still have
to prove that this is feasible when setting up the first real-life
dataSet, where there will be more categories and more dataElements, thus
exploding the number of dataValues.
Getting there was a bit painful, but now it seems to work. I chose Kettle
instead of Talend ETL (both open source) as it seemed easier to get used
to. However, from a data warehouse perspective I'd prefer DHIS2 to offer
some sort of integrated ETL landscape in the long run, which would also
allow aggregating data from tracker into dataSets, tracker to tracker,
dataSets to dataSets etc.
Our current version of the Kettle transformations and jobs was designed
to be generic (not for a specific dataSet, but you have to design your
own extractor, which could be a simple CSV reader or maybe a DHS API
call). If you are interested, I will share them. Just be aware that they
are currently in a very early and rough state and not documented. You'd
have to bring along the willingness to dig into Kettle and be
pain-resistant to a certain degree.
I'd be interested to hear about other experiences ...
Have a nice Sunday,
Uwe
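The pseudo-delta comparison Uwe describes can be sketched as a straight dictionary diff (the key structure - dataElement, period, orgUnit, categoryOptionCombo - is an illustrative assumption, not Uwe's actual Kettle design):

```python
# Pseudo-delta load: compare all source values against what DHIS2 already
# holds, and emit only the records that need creating, updating or deleting.
def delta(source, existing):
    creates = {k: v for k, v in source.items() if k not in existing}
    updates = {k: v for k, v in source.items()
               if k in existing and existing[k] != v}
    deletes = [k for k in existing if k not in source]
    return creates, updates, deletes

existing = {("de1", "201601", "ou1", "coc1"): "10",
            ("de2", "201601", "ou1", "coc1"): "20"}
source = {("de1", "201601", "ou1", "coc1"): "10",   # unchanged: skipped
          ("de2", "201601", "ou1", "coc1"): "25",   # changed: update
          ("de3", "201601", "ou1", "coc1"): "5"}    # new: create
creates, updates, deletes = delta(source, existing)
print(len(creates), len(updates), len(deletes))  # 1 1 0
```

This is why only ~150k of the ~4 million values hit the API each night: unchanged values never leave the staging step.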
On 29.01.2016 at 17:31, Wilson, Randy wrote:
Not here unfortunately... just doing CSV imports from DHS Excel files.
Would be useful for our data warehouse.
Randy
On Jan 29, 2016 2:59 PM, “Olav Poppe” olav.poppe@me.com wrote:
Hi all,
I wanted to hear if anyone has any experience with the DHS API
(http://api.dhsprogram.com/#/index.html), and using it to import survey
results into DHIS?
Olav
Mailing list: https://launchpad.net/~dhis2-users
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-users
More help : https://help.launchpad.net/ListHelp
This message and its attachments are confidential and solely for the
intended recipients. If received in error, please delete them and notify
the sender via reply e-mail immediately.
–
Alex Tumwesigye
Technical Advisor - DHIS2 (Consultant),
Ministry of Health/AFENET
Kampala
Uganda
IT Consultant - BarefootPower Uganda Ltd, SmartSolar, Kenya
IT Specialist (Servers, Networks and Security, Health Information
Systems
- DHIS2 ) & Solar Consultant
+256 774149 775, + 256 759 800161
"I don’t want to be anything other than what I have been - one tree
hill "
–
Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+46764147049
–
Lars Helge Øverland
Lead developer, DHIS 2
University of Oslo
Skype: larshelgeoverland
Thanks Jason. I realise I should learn R.
I was thinking that it should be fairly simple to make a DHIS app that would let you interact with the DHS API to:
- select a country
- select a survey/year for that country
- select the indicators available for that survey
If importing sub-national data, you would have to have some basic orgunit matching as well (though the number is limited for household surveys), but that’s it.
But it does not look like anyone has made that app yet unfortunately.
Olav
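The three selection steps Olav lists would map onto the DHS API's lookup endpoints roughly like this (endpoint names per the DHS API documentation; an untested sketch, with "RW" as an example country code):

```python
from urllib.parse import urlencode

BASE = "http://api.dhsprogram.com/rest/dhs"

# Lookup calls for the three selection steps, in order:
def countries_url():
    return BASE + "/countries?f=json"

def surveys_url(country_id):
    return BASE + "/surveys?" + urlencode({"countryIds": country_id, "f": "json"})

def indicators_url(survey_id):
    return BASE + "/indicators?" + urlencode({"surveyIds": survey_id, "f": "json"})

print(surveys_url("RW"))
```

An app would chain these three lookups to populate its drop-downs, then fetch the data itself for the chosen indicators.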
Hi - we wrapped up a pilot a few months ago with Johns Hopkins University National Evaluation Platform:
to do something similar. In their case they wanted to bring the DHS data into Stata to clean/prep it first, then export both metadata and the data itself for import into DHIS 2. We wrote a DHIS 2 user app plus a parallel Java app to handle the import, and it's being piloted in 4 countries now, to combine the data with their routine health information where possible. We didn't use the DHS APIs, and I'd have to check on what cleanup/prep JHU was doing. Our work ends up being more of a generic Stata import app, although I'm cc'ing Lorill Crees, who has been very involved with it and can comment more.
JHU has tasked us to open-source the work, so we should be able to share code. We don't have it online yet, but I can check if we can send you a copy.
Our next round of work is on R integration, so end users can conduct simple analyses. We've done our technical spikes and are doing proof-of-concept work now.
Aaron