DHIS2 fake data generator

Because I constantly need to test standard reports, I developed an application that generates fake data.
It could be useful for developers, so I am posting the link here:
http://tinyurl.com/guak9w8
Any comments are welcome.

Regards

--
Oreste Parlatano
oreste@parlatano.org
http://oreste.in

Hi,

I presume you want the data set exported with allocated orgunits…

The main drawback with randomly generated numbers is that a lot of things won’t make any sense with such data, like data validation rules and indicators. It will work for certain types of “techie” testing only - health workers won’t be able to relate to data and indicators that make no sense.

A more useful approach would be having an app that can take a real-world DHIS2 database and then randomise those numbers within specific parameters:

  • that general relations between different data elements and catcombos are maintained (e.g. it changes 890 children immunised and 1000 children <1 year into 925 immunised and 1045 children <1, NOT into 1834 children immunised and 287 children <1, as randomly generated numbers might do)

  • that the general activity levels of different orgunit types are maintained (e.g. big hospitals have much larger numbers than small clinics and mobiles).
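
To make the two constraints above concrete, here is a minimal Python sketch. It is not taken from any existing app; the function name `jitter_facility`, the data element names, and the spread values are all assumptions chosen for illustration. The idea is to multiply every value reported by one facility by a single shared random factor, plus a much smaller per-element factor, so that ratios between data elements and the facility's overall size stay roughly intact.

```python
import random


def jitter_facility(values, rng, shared_spread=0.08, own_spread=0.01):
    """Randomise one facility's values while keeping their proportions.

    values: dict mapping a (hypothetical) data element name to a real number.
    One shared factor moves all values together, so the ratios between data
    elements and the facility's overall activity level change only slightly.
    """
    shared = 1.0 + rng.uniform(-shared_spread, shared_spread)
    result = {}
    for name, value in values.items():
        # Small independent wobble so the ratios are not exactly preserved.
        own = 1.0 + rng.uniform(-own_spread, own_spread)
        result[name] = max(0, round(value * shared * own))
    return result


rng = random.Random(2016)  # fixed seed so a test run is repeatable
real = {"children_immunised": 890, "children_under_1": 1000}
fake = jitter_facility(real, rng)
print(fake)
```

Because the shared factor is drawn once per facility, a big hospital stays big and a small clinic stays small, and immunisation coverage stays close to the real 89% instead of jumping to an impossible figure.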

My 2c worth

Regards

Calle

On 8 May 2016 at 06:52, Oreste Parlatano oreste@parlatano.org wrote:

Because of a constant need of testing standard reports, I developed an application able to generate fake data.

It could be useful for developers therefore I post here the link:

http://tinyurl.com/guak9w8

Any comment is welcome

Regards

Oreste Parlatano

oreste@parlatano.org

http://oreste.in


Mailing list: https://launchpad.net/~dhis2-devs

Post to : dhis2-devs@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-devs

More help : https://help.launchpad.net/ListHelp


Calle Hedberg

46D Alma Road, 7700 Rosebank, SOUTH AFRICA

Tel/fax (home): +27-21-685-6472

Cell: +27-82-853-5352

Iridium SatPhone: +8816-315-19119

Email: calle.hedberg@gmail.com

Skype: calle_hedberg


I fully agree with you. Fake data generation could be based on a more comprehensive set of metadata instead of just the dataSets.
I think that further development could solve the problem.
At the moment the application solves the main hassle: it assembles the combinations of dataElement IDs with organisationUnit IDs.
I have not written an in-depth article about it because I only have time on weekends, and not all of them, but I will do so next week so that the inner mechanism can be better understood.
The results of the current mechanism can be modified by means of a spreadsheet.
If there is wide interest in the application, I can continue developing it.
Selfishly, for now the application solves the problem of testing some specific installations here in Mozambique.
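
For readers who want to picture the pairing step, here is a minimal Python sketch. It is not the application's actual code: the function name `build_data_values`, the IDs, and the value range are hypothetical, and the record fields (`dataElement`, `orgUnit`, `period`, `value`) follow the data value import shape as I understand it, so treat them as an assumption to check against your DHIS2 version.

```python
import itertools
import random


def build_data_values(data_set, period, rng, low=0, high=100):
    """Pair every dataElement ID with every organisationUnit ID of a dataSet."""
    pairs = itertools.product(
        data_set["dataElements"], data_set["organisationUnits"]
    )
    return [
        {
            "dataElement": de,
            "orgUnit": ou,
            "period": period,
            "value": str(rng.randint(low, high)),  # purely random placeholder
        }
        for de, ou in pairs
    ]


# Hypothetical IDs standing in for what a dataSets export would contain.
data_set = {
    "dataElements": ["deAAAAAAAA1", "deAAAAAAAA2"],
    "organisationUnits": ["ouBBBBBBBB1", "ouBBBBBBBB2", "ouBBBBBBBB3"],
}
payload = {"dataValues": build_data_values(data_set, "201605", random.Random(1))}
print(len(payload["dataValues"]))
```

The values here are uniformly random, which is exactly the limitation discussed above; the point of the sketch is only the ID combination.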

Regards

--
Oreste Parlatano
oreste@parlatano.org
http://oreste.in
+39 3663165220 (Whatsapp & Viber)
+258 848973244
+258 822732803
Skype oresteafrica

On 2016-05-08 01:06, Calle Hedberg wrote:

Hi,

I presume you want the data set exported with allocated orgunits....

The main drawback with randomly generated numbers is that a lot of
things won't make any sense with such data, like data validation rules
and indicators. It will work for certain types of "techie" testing
only - health workers won't be able to relate to data and indicators
that make no sense.

A more useful approach would be having an app that can take a
real-world DHIS2 database and then randomise those numbers within
specific parameters:
- that general relations between different data elements and catcombos
are maintained (i.e. it changes 890 children immunised and 1000
children <1year into 925 immunised and 1045 children <1, NOT into 1834
children immunised and 287 children <1 as randomly generated numbers
might do)
- the general activity levels of different orgunittypes are maintained
(i.e. big hospitals have much larger numbers than small clinics and
mobiles).

My 2c worth

Regards
Calle

Interesting, I didn't try it with tracker.
The application gets metadata only from dataSets; the main goal is to correctly combine organisationUnit IDs with dataElement IDs.
The dataSets XML contains just that information.
Further development could include information about data types and validation rules, so that the generated fake data would be more accurate.
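
As a rough illustration of what that further development might look like, here is a Python sketch, not something the application does today. The value type names, the ranges, and the rule are assumptions for the example. Each value is drawn according to its declared type, and the whole set is redrawn (simple rejection sampling) until a validation rule, expressed here as a plain Python function, accepts it.

```python
import random

# Assumed value type names and ranges, for illustration only.
GENERATORS = {
    "INTEGER_ZERO_OR_POSITIVE": lambda rng: rng.randint(0, 500),
    "INTEGER_POSITIVE": lambda rng: rng.randint(1, 500),
    "PERCENTAGE": lambda rng: rng.randint(0, 100),
    "BOOLEAN": lambda rng: rng.choice([True, False]),
}


def fake_value(value_type, rng):
    """Draw one value that is legal for the given value type."""
    try:
        return GENERATORS[value_type](rng)
    except KeyError:
        raise ValueError(f"no generator for value type {value_type!r}") from None


def fake_values_with_rule(types, rule, rng, max_tries=1000):
    """Redraw all values until the validation rule accepts them."""
    for _ in range(max_tries):
        values = {name: fake_value(vt, rng) for name, vt in types.items()}
        if rule(values):
            return values
    raise RuntimeError("rule not satisfied within max_tries")


# Hypothetical data elements and rule: immunised must not exceed children <1.
types = {"immunised": "INTEGER_ZERO_OR_POSITIVE", "under_1": "INTEGER_POSITIVE"}
rule = lambda v: v["immunised"] <= v["under_1"]
values = fake_values_with_rule(types, rule, random.Random(7))
print(values)
```

Rejection sampling is the simplest option but becomes slow when rules are strict; drawing the denominator first and bounding the numerator by it would be a more direct alternative.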

--
Oreste Parlatano
oreste@parlatano.org
http://oreste.in
+39 3663165220 (Whatsapp & Viber)
+258 848973244
+258 822732803
Skype oresteafrica

On 2016-05-09 09:34, Dan Cocos wrote:

I had made an attempt at this using the dhis-adhoc code, though it
wasn't completed because there were a few complexities. (I'll try to
dig up the source to share, but it will probably need several updates
to work, as this was when tracker was the Patient Tracker.)

What I attempted was to:
- check the datavalue type and randomize the current values based on
the range set
- remove all first names and last names (this was back when it was the
patient tracker)
- reset all of the email addresses and phone numbers (so that people
don't get alerts from the demo system)

I ran into two complexities. The first is that people would often
insert data using SQL, so you'd find 0.0 where it should be an
integer, along with other inconsistencies and data-related issues.
The second is that people can put patient names and private data in
any of the text fields, so you'll have to replace all text with Lorem
Ipsum to make sure it is anonymized.
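
The clean-up steps described above could be sketched in Python roughly as follows. This is not the dhis-adhoc code, which is not shown in this thread; the record field names (`firstName`, `lastName`, `email`, `phoneNumber`, `comment`, `value`) and the placeholder address are hypothetical.

```python
import random

LOREM = "lorem ipsum dolor sit amet consectetur adipiscing elit".split()


def scrub_text(text, rng):
    """Replace every word with a Lorem Ipsum word, keeping the word count."""
    return " ".join(rng.choice(LOREM) for _ in text.split())


def normalise_integer(raw):
    """Turn strings such as '0.0' or '12.00' into '0' or '12'.

    Anything that is not a whole number is returned unchanged.
    """
    try:
        number = float(raw)
    except ValueError:
        return raw
    return str(int(number)) if number.is_integer() else raw


def anonymise_record(record, rng):
    """Return a copy with names removed, contacts reset, free text scrubbed."""
    clean = dict(record)
    clean.pop("firstName", None)
    clean.pop("lastName", None)
    if "email" in clean:
        clean["email"] = "nobody@example.org"  # avoids alerts to real people
    if "phoneNumber" in clean:
        clean["phoneNumber"] = ""
    if "comment" in clean:
        clean["comment"] = scrub_text(clean["comment"], rng)
    if "value" in clean:
        clean["value"] = normalise_integer(clean["value"])
    return clean


record = {
    "firstName": "Jane",
    "lastName": "Doe",
    "email": "jane@example.com",
    "phoneNumber": "+000 0000000",
    "comment": "patient Jane Doe referred to district hospital",
    "value": "0.0",
}
clean = anonymise_record(record, random.Random(0))
print(clean)
```

Scrubbing every text field, rather than trying to detect names, follows the point above that private data can sit in any free-text field.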

On May 8, 2016, at 9:17 AM, Oreste Parlatano <oreste@parlatano.org> wrote:

I fully agree with you. Fake data generation could be based on a more comprehensive set of metadata instead of just the dataSets.
I think that further development could solve the problem.
At the moment the application solves the main hassle: it assembles the combinations of dataElement IDs with organisationUnit IDs.
I have not written an in-depth article about it because I only have time on weekends, and not all of them, but I will do so next week so that the inner mechanism can be better understood.
The results of the current mechanism can be modified by means of a spreadsheet.
If there is wide interest in the application, I can continue developing it.
Selfishly, for now the application solves the problem of testing some specific installations here in Mozambique.

Regards

Oreste Parlatano
oreste@parlatano.org
http://oreste.in
