Generating Min/Max

I’m trying to understand how DHIS generates the Min/Max values for a given Data Element/Period. The documentation only mentions that Min/Max can be set manually or calculated automatically. When they are calculated automatically, does anyone know what logic is used to set the values? Twice the average of previous periods?

Hi Rodolfo,

I have used it before and, according to the results we got, it takes the highest and lowest values ever entered in the selected period for a given dataset.

Regards

On Mon, Apr 20, 2015 at 3:24 PM, Rodolfo Melia rmelia@knowming.com wrote:

I’m trying to understand how DHIS generates the Min/Max values for a given Data Element/Period. The documentation only mentions that Min/Max can be set manually or calculated automatically. When they are calculated automatically, does anyone know what logic is used to set the values? Twice the average of previous periods?


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

Prosper Behumbiize, MPH
Phone: +256 414 320076
Cell: +256 772 139037
+256 752 751776

Thanks Prosper - that makes sense.
I guess that to apply a different rule (e.g., Max should be 150% of the previous period), the max will need to be set via an app.

Are the Max/Min of a Data Element/Org Unit exposed in the API?

R

Rodolfo Meliá

*Principal | *rmelia@knowming.com

Skype: rod.melia | +44 777 576 4090 | +1 708 872 7636

www.knowming.com

Hi,

Sorry, but if the automatic min-max calculation simply retrieves the historical minimum and maximum values, then it makes little sense and has no real value.

I don’t have time to verify it right now, but my assumption has always been that DHIS2 uses a method similar to the one in DHIS 1.4:

The user specifies a period to be used for the min/max analysis - typically 12-18 months (longer is better in a stable health establishment environment, but a shorter period might be optimal in areas where patient numbers are changing rapidly).

The average and standard deviation are calculated for each OrganisationUnit and DataElement combination.

The min is set to the average minus stdev x constant, and the max to the average plus stdev x constant. Typical constants are 1.5 - 2.0 (there is a “Data analysis std dev factor” under General Settings, set to 2.0 by default; the system might be using that).
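As a rough illustration of the three steps above, here is a minimal Python sketch. It is not taken from the DHIS source: the function name `min_max_bounds`, the sample values, and the choice of the sample (rather than population) standard deviation are all assumptions made for the example.

```python
from statistics import mean, stdev


def min_max_bounds(values, factor=2.0):
    """Return (min, max) as mean -/+ factor * standard deviation.

    `values` holds one number per period for a single
    OrganisationUnit/DataElement combination. `factor` plays the role
    of the "std dev factor" constant (assumed default: 2.0).
    """
    avg = mean(values)
    sd = stdev(values)  # sample standard deviation; needs at least 2 values
    return avg - factor * sd, avg + factor * sd


# Hypothetical 12 months of values for one facility and one data element.
monthly_values = [40, 44, 38, 42, 46, 41, 39, 43, 45, 40, 42, 44]
low, high = min_max_bounds(monthly_values, factor=2.0)
print(round(low, 1), round(high, 1))
```

A smaller factor such as 1.5 narrows the band and flags more values as outliers; a larger one does the opposite.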

Best regards

Calle


Mailing list: https://launchpad.net/~dhis2-devs

Post to : dhis2-devs@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-devs

More help : https://help.launchpad.net/ListHelp


Calle Hedberg

46D Alma Road, 7700 Rosebank, SOUTH AFRICA

Tel/fax (home): +27-21-685-6472

Cell: +27-82-853-5352

Iridium SatPhone: +8816-315-19274

Email: calle.hedberg@gmail.com

Skype: calle_hedberg


Hi,

I think, as Prosper says, it is simply the overall max and min, and this can be set by the user or calculated externally.

Calle, the use of standard deviation is problematic for several reasons, mostly because it assumes that the data are normally distributed, which is not always the case. It may be appropriate for some data elements, but in many cases it is not an appropriate statistical assumption: the data we have seen are often zero-inflated or follow something more like a logistic distribution (as opposed to a normal distribution). So applying something like a standard deviation may (and does, in the case of DHIS2) result in many negative min values. So, although I think the method of DHIS 1.4 may be somewhat better, it is still not always appropriate, as the assumption of a normal distribution is simply not always warranted.

Regards,

Jason

My test also shows that only the min/max across all values of an org unit is used for setting min/max. As this is unlikely to change any time soon, I just want to understand whether I can set the min/max values based on my own logic and then inject them via the API.

I think that would work, but as far as I know you cannot import these via the API. It is, however, pretty trivial to do it directly in the database.

You might be able to script it, however, by calling something like

https://apps.dhis2.org/demo/dhis-web-dataentry/saveMinMaxLimits.action?organisationUnitId=ImspTQPwCqd&dataElementId=BOSZApCrBni&categoryOptionComboId=TkDhg29x18A&minLimit=10&maxLimit=20

Maybe the devs can suggest a better way, but in the past, we have pulled the data values into statistical software, calculated the min/max according to a given model, and then injected them back via SQL.
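To make the scripting idea concrete, here is a small Python sketch that only builds the URL shown above from its parts. The function name `min_max_url` is made up for the example; the parameter names and IDs are copied from the URL in this message. No request is sent, and whether the action accepts scripted calls, and how authentication would be handled, is not confirmed here.

```python
from urllib.parse import urlencode

# Base address copied from the example URL above (demo server).
BASE = "https://apps.dhis2.org/demo/dhis-web-dataentry/saveMinMaxLimits.action"


def min_max_url(org_unit, data_element, option_combo, min_limit, max_limit):
    """Build the saveMinMaxLimits URL for one org unit / data element / option combo."""
    params = {
        "organisationUnitId": org_unit,
        "dataElementId": data_element,
        "categoryOptionComboId": option_combo,
        "minLimit": min_limit,
        "maxLimit": max_limit,
    }
    # urlencode keeps the insertion order of the dict and escapes values.
    return f"{BASE}?{urlencode(params)}"


url = min_max_url("ImspTQPwCqd", "BOSZApCrBni", "TkDhg29x18A", 10, 20)
print(url)
```

Looping over externally calculated limits and calling such a URL for each row would be one way to script it, assuming the server accepts the calls.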

Regards,
Jason

Hi,

No generic model will fit ALL types of data elements, obviously - but it needs to fit the most typical data elements where you want to USE the min-max to catch e.g. reporting or capturing mistakes. If using a stdev model, you reset any negative min values (which are impossible) to zero.
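A minimal Python sketch of that reset, under stated assumptions: the function name `clamped_bounds`, the factor of 2.0, and the zero-heavy sample counts are invented for illustration and are not taken from DHIS.

```python
from statistics import mean, stdev


def clamped_bounds(values, factor=2.0):
    """Mean -/+ factor * stdev, with a negative lower bound reset to zero."""
    avg, sd = mean(values), stdev(values)
    raw_lower = avg - factor * sd
    # Counts cannot be negative, so a negative lower bound is replaced by 0.
    return max(0.0, raw_lower), avg + factor * sd


# Hypothetical counts with many zeros: the raw lower bound is negative.
counts = [0, 0, 0, 5, 0, 0, 12, 0, 0, 0, 3, 0]
lower, upper = clamped_bounds(counts)
print(lower, round(upper, 2))
```

The reset only repairs the lower bound; it does not address whether the upper bound is meaningful for such skewed data.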

Using the historic min and max in general makes the min-max useless for such pattern recognition, because any outbreak, campaign, or disruption in a service for whatever reason will permanently affect the min/max so that they no longer represent the “normal” variation of a data element.

But it seems I was wrong about this being implemented in DHIS2, then. Jason’s comment that “in the past, we have pulled the data values into statistical software, calculated the min/max according to a given model, and then injected them back via SQL” clearly indicates what is required. We can argue over the model or models to be used, but some kind of model depicting the normal distribution of values over time is clearly required. You cannot expect average users to be able to pull data values into STATA or SAS or SPSS, calculate min/max using a given model, and then re-insert the values using SQL.

I’ll write a blueprint for this - either as part of the core or as an app - when I get my head above water :)

Regards

Calle

Sure, and that is my point: generating min/max is not something “normal users” would do. It is a potentially complex operation that requires statistical knowledge.

But again, I think your insistence that it be calculated according to a “normal model” is simply not right. It may work, but it might be worth confirming whether it is a valid assumption with your data before writing a blueprint!

Regards,
Jason

Hi there,

Calle is right here - we do average, then calculate std dev and set the
upper and lower bounds for each value.

We use data from ALL available time periods to calculate this (per org
unit, data element, option combo).
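The grouping described here can be sketched in Python as follows. This is an illustration only, not the DHIS2 implementation: the function name `bounds_per_series`, the row layout, the IDs such as `ou1`, and the use of the sample standard deviation are all assumptions.

```python
from collections import defaultdict
from statistics import mean, stdev


def bounds_per_series(rows, factor=2.0):
    """Compute (lower, upper) per (org unit, data element, option combo).

    `rows` is an iterable of (org_unit, data_element, option_combo, value)
    tuples covering all available periods.
    """
    series = defaultdict(list)
    for org_unit, data_element, option_combo, value in rows:
        series[(org_unit, data_element, option_combo)].append(value)

    result = {}
    for key, values in series.items():
        if len(values) < 2:
            continue  # the standard deviation is undefined for a single value
        avg, sd = mean(values), stdev(values)
        result[key] = (avg - factor * sd, avg + factor * sd)
    return result


# Hypothetical data values; the IDs are placeholders, not real DHIS2 UIDs.
rows = [
    ("ou1", "de1", "coc1", 10),
    ("ou1", "de1", "coc1", 14),
    ("ou1", "de1", "coc1", 12),
    ("ou2", "de1", "coc1", 100),
    ("ou2", "de1", "coc1", 120),
    ("ou2", "de2", "coc1", 7),  # only one value, so this series is skipped
]
limits = bounds_per_series(rows)
print(limits)
```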

Mind you, we should not really debate whether to use standard deviations or
not, but rather whether we should support additional _distributions_ to better
handle different kinds of data. We currently use the normal distribution
<http://en.wikipedia.org/wiki/Normal_distribution>.

Rodolfo - supporting min-max in the Web API is a good idea to allow for
third-party tools - feel free to write a blueprint.

regards,

Lars

The number of std devs is 2 by default, but can be set as desired from apps > settings > general settings.

Good. I probably should have known that already, which is why I had to do some statistical analysis outside of DHIS2 to calculate reasonable min/max values. A quick check of the validity of a normal distribution can be done with the skewness and kurtosis, which give an idea of how “tilted” a given distribution is.

https://www.dhis2.org/doc/snapshot/en/developer/html/apas06.html
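For readers without statistical software at hand, the skewness and kurtosis check can be approximated with the Python standard library. This sketch uses the simple moment-based (population) formulas; the function name and both sample series are invented for the example, and statistical packages may apply different small-sample corrections.

```python
from statistics import mean, pstdev


def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both about 0 for normal data)."""
    avg = mean(values)
    sd = pstdev(values)  # population standard deviation
    if sd == 0:
        raise ValueError("all values are identical; the moments are undefined")
    n = len(values)
    skew = sum(((v - avg) / sd) ** 3 for v in values) / n
    kurt = sum(((v - avg) / sd) ** 4 for v in values) / n - 3.0
    return skew, kurt


# Hypothetical series: one roughly symmetric, one with many zeros.
symmetric = [38, 40, 41, 42, 42, 43, 44, 46]
zero_inflated = [0, 0, 0, 5, 0, 0, 12, 0, 0, 0, 3, 0]

skew_sym, kurt_sym = skewness_and_excess_kurtosis(symmetric)
skew_zi, kurt_zi = skewness_and_excess_kurtosis(zero_inflated)
print(round(skew_sym, 2), round(skew_zi, 2))
```

A skewness far from zero, as in the zero-heavy series, is a hint that mean plus or minus a multiple of the standard deviation may not describe the data well.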

Anyway, support for import via the API would be good.

Regards,

Jason

Hi

“Calle is right here - we do average, then calculate std dev and set the upper and lower bounds for each value. We use data from ALL available time periods to calculate this (per org unit, data element, option combo).”

Here and there and back again :)

So I wasn’t off the reservation, then. We have used the normal distribution like this in DHIS 1.x for around 17 years, and it fits the majority of data elements. In general, this distribution model handles random outbreaks and disruptions reasonably well, since the impact of such outliers is dampened. Data elements representing conditions or services with strong seasonal variation do not fit so well, and some very particular items like “Male condoms distributed” tend to vary so much that the min/max is generally disregarded (outliers here also matter a lot less - when you distribute 1-2 billion condoms annually, an error of a few thousand does not matter). In DHIS 1.4 there is also a function for setting absolute min-max values, most typically used for data elements where e.g. only 0 and 1 are valid values. For such cases, statistically calculating min-max is obviously irrelevant.

I don’t like the use of ALL available time periods, though, since a large number of health facilities will see significant changes in their patient mix and patient numbers over, let us say, a 10-year period. We have found that 12-18 months provides a good compromise.

So there are still some room for improvement.

Regards
Calle

Calle Hedberg

46D Alma Road, 7700 Rosebank, SOUTH AFRICA

Tel/fax (home): +27-21-685-6472

Cell: +27-82-853-5352

Iridium SatPhone: +8816-315-19274

Email: calle.hedberg@gmail.com

Skype: calle_hedberg


Thanks everyone. It is good to understand how the min/max is calculated in DHIS (the documentation should be updated with the content of this email). We definitely need other ways to set min/max values - my case has specific logic, which needs to be implemented via SQL until there is a way to push these values via the API.

Can I confirm that the min/max values are set by

  • Data Element
  • Org Unit
  • Cat Combo
  • Attribute Combo?

R

Mailing list: https://launchpad.net/~dhis2-devs

Post to : dhis2-devs@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-devs

More help : https://help.launchpad.net/ListHelp

The min-max data element values are set by:

  • data element
  • org unit
  • category option combo

Is the attribute option combo likely to be added to the mix?

  • The logic that IPPF uses for setting the maximum value is different for Attribute 1 and Attribute 2 of a data value.

R

Hi Rodolfo,

I think adding the attribute option combo to the min-max data element makes sense. The problem, of course, is when you want a min-max to apply to all attribute option combos rather than a specific one, which I can only guess is the main use case. So it has to be optional. It can of course be done, but it requires some work and will be slightly complex.
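One way to picture the optional attribute option combo is a lookup that prefers a specific range and falls back to a generic one. The sketch below is only an illustration of that idea in Python; the key structure, the names and the use of `None` to mean "all attribute option combos" are hypothetical and do not reflect how DHIS2 stores min-max values.

```python
from typing import NamedTuple, Optional


class MinMaxKey(NamedTuple):
    """Hypothetical key for a min-max range."""
    data_element: str
    org_unit: str
    category_option_combo: str
    # None means the range applies to all attribute option combos.
    attribute_option_combo: Optional[str] = None


def find_min_max(bounds, data_element, org_unit, coc, aoc):
    """Return the (min, max) for a specific attribute option combo if one
    is defined, otherwise the generic range, otherwise None."""
    specific = bounds.get(MinMaxKey(data_element, org_unit, coc, aoc))
    if specific is not None:
        return specific
    return bounds.get(MinMaxKey(data_element, org_unit, coc, None))
```

With a generic range and one specific range defined for the same data element, org unit and category option combo, a value reported under the specific attribute option combo gets the specific range, and everything else gets the generic one.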

Feel free to write a blueprint about it with some more detail.

best regards,

Lars

Hi Calle,

I agree it makes sense to have a “from date” for the data values to include in the std dev and average calculation. I have changed it so it now includes data from 2 years before the start date of the validation analysis period. It also helps the performance of the validation process.
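The “from date” idea can be sketched as a simple filter applied before the average and std dev are computed. This is a Python illustration only, not the actual change: the function name is made up, each value is assumed to carry a single date (for example its period start date), and whether the window boundaries are inclusive or exclusive in DHIS2 is not stated here, so the choice below is an assumption.

```python
from datetime import date


def values_in_lookback(dated_values, analysis_start, years=2):
    """Keep values dated from `years` before analysis_start (inclusive)
    up to analysis_start (exclusive).

    `dated_values` is an iterable of (date, value) pairs.
    """
    try:
        window_start = analysis_start.replace(year=analysis_start.year - years)
    except ValueError:
        # 29 February with no counterpart in the target year: use 28 February.
        window_start = analysis_start.replace(
            year=analysis_start.year - years, day=28
        )
    return [v for d, v in dated_values if window_start <= d < analysis_start]
```

The `years` parameter also covers the 12-18 month window Calle suggested only approximately; a month-based window would need a slightly different date calculation.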

regards,

Lars

Lars,

Excellent - thanks for that. Two years is a reasonable default value - we’ve always used 18 months as the default in 1.4, so almost the same.

I would nevertheless argue that

(a) user-defined period, stdev value, and possibly average/median parameters should ideally be specified on a per data element basis;

(b) adding the attribute option combo to the mix is probably required to cater for instances where data is captured for e.g. multiple collaborating NGOs;

(c) tools enabling the specification of said parameters for larger groups of data elements will make it easier to manage;

(d) a cherry on top would be the ability to adjust for typical seasonal fluctuations.

I will try to write a blueprint for something like the above. It is not a critical need, but it would be a positive step.

Regards

Calle

Lars,

By the way - last week I saw that the bug related to OrgUnit counts in indicators is still there. I checked the Sierra Leone demo too, and it’s the same - integrity checks are still showing “org-unit-do not exist” etc.

Is this an indicator bug or an integrity check bug?

Regards
Calle
