I’m trying to understand how DHIS generates the Min/Max values for a given Data Element/Period. The documentation only mentions that Max/Min can be set manually or calculated automatically. When they are calculated automatically, does anyone know what logic is used to set the values? Twice the average of previous periods?
Hi Rodolfo,
I have used it before, and according to the results we got, it takes the highest and lowest values ever entered in the selected period for a given dataset.
Regards
Mailing list: https://launchpad.net/~dhis2-users
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-users
More help : https://help.launchpad.net/ListHelp
–
Prosper Behumbiize, MPH
Phone: +256 414 320076
Cell: +256 772 139037
+256 752 751776
Thanks Prosper - that makes sense.
I guess that to apply a different rule (e.g., Max should be 150% of the previous period), the max will need to be set via an app.
Are the Max/Min of a Data Element/Org Unit exposed in the API?
R
Rodolfo Meliá
Principal | rmelia@knowming.com
Skype: rod.melia | +44 777 576 4090 | +1 708 872 7636
Hi,
Sorry, but if the automatic min-max calculation simply retrieves the historical minimum and maximum values, then that makes little sense and would have no real value.
I don’t have time to verify it right now, but my assumption has always been that DHIS2 is using a method similar to the one in DHIS 1.4:
The user specifies a period to be used for the min/max analysis - typically 12-18 months (longer is better in a stable health establishment environment, but a shorter period might be optimal in areas where patient numbers are changing rapidly).
The average and standard deviation are calculated for each OrganisationUnit and DataElement combination.
The min is set to the average minus stdev x constant, and the max is set to the average plus stdev x constant. Typical constants are 1.5 - 2.0. (There is a “Data analysis std dev factor” specified under General Settings, set to 2.0 by default. The system might be using that.)
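The three steps above can be sketched in a few lines of Python. This is only an illustration of the described method, not the DHIS code: the function name `min_max_bounds`, the sample figures, and the choice of the sample (n - 1) standard deviation are all assumptions.

```python
from statistics import mean, stdev


def min_max_bounds(values, factor=2.0):
    """Return (min, max) as average -/+ factor * standard deviation.

    `values` holds the data values of one org unit / data element
    combination for the chosen analysis period.
    """
    if len(values) < 2:
        raise ValueError("need at least two values to estimate a standard deviation")
    avg = mean(values)
    sd = stdev(values)  # sample standard deviation: an assumption of this sketch
    return avg - factor * sd, avg + factor * sd


# Hypothetical monthly values for one facility and one data element.
monthly_visits = [100, 110, 90, 105, 95, 100]
low, high = min_max_bounds(monthly_visits, factor=2.0)
print(round(low, 1), round(high, 1))
```

A larger factor widens the band and flags fewer values; a smaller one does the opposite.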
Best regards
Calle
Mailing list: https://launchpad.net/~dhis2-devs
Post to : dhis2-devs@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-devs
More help : https://help.launchpad.net/ListHelp
–
Calle Hedberg
46D Alma Road, 7700 Rosebank, SOUTH AFRICA
Tel/fax (home): +27-21-685-6472
Cell: +27-82-853-5352
Iridium SatPhone: +8816-315-19274
Email: calle.hedberg@gmail.com
Skype: calle_hedberg
Hi,
I think as Prosper says, it is simply the overall max and min, and this can be set by the user, or calculated externally.
Calle, the use of standard deviation is problematic for several reasons, however, mostly because it assumes that the data are actually normally distributed, which is not always the case. This may be appropriate for some data elements, but in many cases it is not an appropriate statistical assumption, and the data we have seen are often zero-inflated or follow something more like a logistic distribution (as opposed to a normal distribution). So applying something like a standard deviation may (and does in the case of DHIS2) result in many negative min values. So, although I think the method of DHIS 1.4 may be somewhat better, it is still not always appropriate, as the assumption of a normal distribution is simply not always warranted.
Regards,
Jason
My test also shows that only the min/max across all values of an org unit is used for setting min/max. As this is unlikely to change any time soon, I just want to understand whether I can set the min/max values based on my own logic and then inject them via the API.
I think that would work, but as far as I know you cannot import these via the API. It is, however, pretty trivial to do it directly in the database.
You might be able to script it however by calling to something like
Maybe the devs can suggest a better way, but in the past, we have pulled the data values into statistical software, calculated the min/max according to a given model, and then injected them back via SQL.
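As a rough sketch of that workflow (compute bounds outside the system with your own rule, then write them back with SQL), the Python below uses an in-memory SQLite database as a stand-in. The table `demo_min_max`, its columns, the identifiers, and the 50% lower ratio are hypothetical and do not reflect the actual DHIS2 schema; only the 150% upper rule comes from earlier in this thread.

```python
import sqlite3


def rule_based_bounds(previous_value, max_ratio=1.5, min_ratio=0.5):
    """Custom rule: max is 150% of the previous period's value.

    The 50% lower ratio is an assumption added for the example.
    """
    return round(previous_value * min_ratio), round(previous_value * max_ratio)


# In-memory stand-in for the real database; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE demo_min_max (
        org_unit TEXT,
        data_element TEXT,
        min_value INTEGER,
        max_value INTEGER,
        PRIMARY KEY (org_unit, data_element)
    )
    """
)

# Previous-period values keyed by (org unit, data element); made-up data.
previous = {("OU_A", "DE_1"): 120, ("OU_B", "DE_1"): 40}

rows = [(ou, de, *rule_based_bounds(v)) for (ou, de), v in previous.items()]
# INSERT OR REPLACE lets the script be re-run each period without duplicates.
conn.executemany("INSERT OR REPLACE INTO demo_min_max VALUES (?, ?, ?, ?)", rows)
conn.commit()

result = conn.execute("SELECT * FROM demo_min_max ORDER BY org_unit").fetchall()
print(result)
```

Against a production database the same idea would need the real table and column names, and a backup first.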
Regards,
Jason
Hi,
No generic model will fit ALL types of data elements, obviously - but it needs to fit the most typical data elements where you want to USE the min-max to catch e.g. reporting or capturing mistakes. If using a stdev model, you reset any negative min values (which are impossible) to zero.
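A minimal sketch of that reset, assuming the same average plus/minus stdev model and made-up, zero-heavy data (the function name `clamped_bounds` and the figures are hypothetical):

```python
from statistics import mean, stdev


def clamped_bounds(values, factor=2.0):
    """Average -/+ factor * stdev, with a negative lower bound reset to zero."""
    avg = mean(values)
    sd = stdev(values)  # sample standard deviation: an assumption of this sketch
    lower = max(0.0, avg - factor * sd)  # counts cannot be negative
    upper = avg + factor * sd
    return lower, upper


# Hypothetical monthly case counts with many zeros and one spike.
monthly_cases = [0, 0, 0, 1, 0, 7, 0, 0, 2, 0, 0, 0]
lower, upper = clamped_bounds(monthly_cases)
print(lower, round(upper, 2))
```

With data like this the unclamped lower bound is well below zero, so the min carries no information and only the max can flag a suspicious value.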
Using the historic min and max in general makes the min-max useless for such pattern recognition, because any outbreak or campaign or disruption in a service for whatever reason will permanently affect the min/max so that they no longer represent the “normal” variation of a data element.
But it seems I was wrong about this being implemented in DHIS2, then. Jason’s comment that “in the past, we have pulled the data values into statistical software, calculated the min/max according to a given model, and then injected them back via SQL” clearly indicates what is required. We can argue over the model or models to be used, but some kind of model depicting the normal distribution of values over time is clearly required. You cannot expect average users to be able to pull data values into STATA or SAS or SPSS, calculate min/max using a given model, and then re-insert the values using SQL.
I’ll write a blue-print for this - either as part of the core or as an app - when I get my head above the water
Regards
Calle
Sure, and that is my point: generating min/max values is not something “normal users” would do. This is a potentially complex operation, which requires statistical knowledge.
But again, your insistence that it be calculated according to a “normal model” is, I think, simply not right. It may work, but it might be worth confirming whether it is a valid assumption for your data before writing a blueprint!
Regards,
Jason
Hi there,
Calle is right here - we do average, then calculate std dev and set the upper and lower bounds for each value.
We use data from ALL available time periods to calculate this (period org unit, data element, option combo)
Mind you we should not really debate whether to use standard deviations or not, rather if we should support additional _distributions_ to better handle different kinds of data. We currently use the normal distribution <http://en.wikipedia.org/wiki/Normal_distribution>.
Rodolfo - supporting min-max in the Web API is a good idea to allow for third-party tools - feel free to write a blueprint.
regards,
Lars
The number of std dev is by default 2, but can be set as desired from apps settings > general settings.
Good. I probably should have known that already, which is why I had to do some statistical analysis outside of DHIS2 to actually calculate reasonable min/max values. A quick check of the validity of a normal distribution can be done with the skewness and kurtosis, which provide an idea of how “tilted” a given distribution is.
https://www.dhis2.org/doc/snapshot/en/developer/html/apas06.html
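For readers without a statistics package at hand, the skewness and kurtosis check can be sketched in plain Python. The function below uses the simple moment-based (population) formulas, which is an assumption of this sketch; statistical packages often apply bias corrections, so their numbers may differ slightly. The function name and data are hypothetical.

```python
from statistics import fmean


def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis.

    Both are close to 0 for normally distributed data; a large positive
    skewness indicates a long right tail (e.g. zero-heavy counts).
    """
    n = len(values)
    avg = fmean(values)
    m2 = sum((v - avg) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; skewness and kurtosis are undefined")
    m3 = sum((v - avg) ** 3 for v in values) / n
    m4 = sum((v - avg) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical zero-heavy monthly counts: expect a clearly positive skewness.
skew, kurt = skewness_and_excess_kurtosis([0, 0, 0, 1, 0, 7, 0, 0, 2, 0, 0, 0])
print(round(skew, 2), round(kurt, 2))
```

Values far from zero suggest that an average plus/minus stdev band will describe the data element poorly.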
Anyway, support for import via the API would be good.
Regards,
Jason
Hi
“Calle is right here - we do average, then calculate std dev and set the upper and lower bounds for each value. We use data from ALL available time periods to calculate this (period org unit, data element, option combo).”
Here and there and back again
So I wasn’t off the reservation, then. We have used the normal distribution like this in DHIS 1.x for around 17 years, and it fits the majority of data elements. In general, this distribution model handles random outbreaks and disruptions reasonably well, since the impact of such outliers is dampened. Data elements representing conditions or services with strong seasonal variation do not fit so well, and some very particular issues like “Male condoms distributed” tend to vary so much that the min/max is generally disregarded (outliers here also matter a lot less - when you distribute 1-2 billion condoms annually, an error of a few thousand does not matter). In DHIS 1.4 there is also a function for setting absolute min-max values - most typically used for data elements where e.g. only 0 and 1 are valid values. For such cases, statistically calculating min-max is obviously irrelevant.
I don’t like the use of ALL available time periods, though, since a large number of health facilities will see significant changes in their patient mix and patient numbers over let us say a 10 year period. We have found that 12-18 months provide a good compromise.
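To illustrate why the window matters, here is a small Python sketch that computes the same average plus/minus stdev band over only the most recent periods. The function name `windowed_bounds`, the data, and the zero floor on the lower bound are assumptions of the example; periods are keyed as "YYYYMM" strings so that sorting them is chronological.

```python
from statistics import mean, stdev


def windowed_bounds(values_by_period, window=12, factor=2.0):
    """Average -/+ factor * stdev over the most recent `window` periods."""
    recent_periods = sorted(values_by_period)[-window:]
    recent = [values_by_period[p] for p in recent_periods]
    avg, sd = mean(recent), stdev(recent)
    return max(0.0, avg - factor * sd), avg + factor * sd


# Hypothetical facility whose monthly volume jumped from ~50 to ~200.
history = {}
for i in range(24):
    period = f"{2013 + i // 12}{i % 12 + 1:02d}"
    history[period] = (50 if i < 12 else 200) + i % 3

recent_low, recent_high = windowed_bounds(history, window=12)
all_low, all_high = windowed_bounds(history, window=24)
print(round(recent_low, 1), round(recent_high, 1))
print(round(all_low, 1), round(all_high, 1))
```

Over the full history the band is so wide that almost any value passes; over the last 12 months it tracks the facility's current level.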
So there is still some room for improvement.
Regards
Calle
···
On 20 April 2015 at 16:15, Jason Pickering jason.p.pickering@gmail.com wrote:
Good. I probably should have known that already, thus why I had to do some statistical analysis outside of DHIS2 to actually calculate reasonable min max. A quick check of the validity of a normal distribution, can be with the skewness and kurtosis , which provide a idea of how “tilted” a given distribution is.
https://www.dhis2.org/doc/snapshot/en/developer/html/apas06.html
Anyway, support for import via the API would be good.
Regards,
Jason
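The skewness and kurtosis check mentioned above can be sketched in plain Python as follows. This is only an illustration using the simple moment-based definitions (no small-sample correction); the function name is hypothetical and this is not code from DHIS2. For a normal distribution both numbers are close to 0.

```python
from statistics import mean


def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis of a list of numbers."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    avg = mean(values)
    m2 = sum((v - avg) ** 2 for v in values) / n  # variance
    if m2 == 0:
        raise ValueError("all values are identical")
    m3 = sum((v - avg) ** 3 for v in values) / n
    m4 = sum((v - avg) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


# A symmetric series has zero skewness; one large spike gives positive skew.
print(skewness_and_excess_kurtosis([1, 2, 3, 4, 5]))
print(skewness_and_excess_kurtosis([1, 1, 1, 1, 10]))
```

Values far from 0 suggest that a mean and standard deviation band may be a poor fit for that data element.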
On Mon, Apr 20, 2015, 16:06 Lars Helge Øverland larshelge@gmail.com wrote:
–
Hi there,
Calle is right here - we do average, then calculate std dev and set the upper and lower bounds for each value.
We use data from ALL available time periods to calculate this (period org unit, data element, option combo)
Mind you we should not really debate whether to use standard deviations or not, rather if we should support additional distributions to better handle different kinds of data. We currently use the normal distribution.
Rodolfo - supporting min-max in the Web API is a good idea to allow for third-party tools - feel free to write a blueprint.
regards,
Lars
Calle Hedberg
46D Alma Road, 7700 Rosebank, SOUTH AFRICA
Tel/fax (home): +27-21-685-6472
Cell: +27-82-853-5352
Iridium SatPhone: +8816-315-19274
Email: calle.hedberg@gmail.com
Skype: calle_hedberg
Thanks everyone. It is good to understand how the max/min is calculated in DHIS (the documentation should be updated with the content of this email). We definitely need other ways to set min/ max values - my case has a specific logic, which needs to be implemented via SQL, until there is a way to push these values via the API.
Can I confirm that the Min/ Max are set by
- Data Element
- Org Unit
- Cat Combo
- Attribute Combo?
R
Mailing list: https://launchpad.net/~dhis2-devs
Post to : dhis2-devs@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-devs
More help : https://help.launchpad.net/ListHelp
Rodolfo Meliá
*Principal | *rmelia@knowming.com
Skype: rod.melia | +44 777 576 4090 | +1 708 872 7636
The minmax dataelement values are set by:
- data element
- org unit
- category option combo
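To make the three-part key concrete, here is a small Python sketch of a min-max lookup keyed by data element, org unit and category option combo. The class name, the function name and the identifiers are placeholders invented for this illustration; they do not reflect the DHIS2 data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinMaxKey:
    """Hypothetical composite key; frozen so it can be used in a dict."""
    data_element: str
    org_unit: str
    category_option_combo: str


# Placeholder identifiers, not real DHIS2 UIDs.
key = MinMaxKey("de_placeholder", "ou_placeholder", "coc_placeholder")
ranges = {key: (10, 50)}


def in_range(ranges, key, value):
    """True if value is inside the stored range, or if no range is stored."""
    if key not in ranges:
        return True
    low, high = ranges[key]
    return low <= value <= high


print(in_range(ranges, key, 30))
print(in_range(ranges, key, 60))
```

Note that the attribute option combo is not part of the key, so one range applies to all attribute option combos of a data value.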
Is the attribute option combo likely to be added to the mix?
The logic that IPPF uses for setting the maximum value is different for Attribute 1 and Attribute 2 of a data value.
R
Hi Rodolfo,
I think adding the attribute option combo to the min-max data element makes sense. The problem, of course, is when you want a min-max to apply to all attribute option combos rather than a specific one, which I can only guess is the main use case. So it has to be optional. It can of course be done, but it requires some work and will be slightly complex.
Feel free to write a blueprint about this with some more detail.
best regards,
Lars
Hi Calle,
I agree it makes sense to have a “from date” for the data values to include in the std dev and average calculation. I have changed it so that it now includes data from the 2 years before the start date of the validation analysis period. It also helps the performance of the validation process.
regards,
Lars
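The “from date” idea can be sketched as a simple filter applied before the average and standard deviation are computed. This is a hypothetical Python illustration, not the DHIS2 implementation: the function name, the representation of each value as a (period start date, value) pair, and the exact boundary handling are all assumptions.

```python
from datetime import date


def values_in_window(dated_values, analysis_start, years=2):
    """Keep values whose period starts within `years` before analysis_start.

    dated_values: iterable of (period_start_date, value) pairs.
    The window is [analysis_start - years, analysis_start); whether the
    boundaries are inclusive in DHIS2 is not stated in the thread.
    """
    try:
        from_date = analysis_start.replace(year=analysis_start.year - years)
    except ValueError:
        # analysis_start is 29 February and the target year is not a leap year
        from_date = analysis_start.replace(year=analysis_start.year - years, day=28)
    return [v for d, v in dated_values if from_date <= d < analysis_start]


# Made-up history; the 2012 value falls outside the two-year window.
history = [
    (date(2012, 1, 1), 500),
    (date(2013, 6, 1), 100),
    (date(2014, 6, 1), 110),
    (date(2015, 5, 1), 120),
]
print(values_in_window(history, date(2015, 5, 1)))
```

Restricting the window also reduces the number of data values read, which is consistent with the performance benefit mentioned above.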
Lars,
Excellent - thanks for that. Two years is a reasonable default value - we’ve always used 18 months as the default in 1.4, so almost the same.
I would nevertheless argue that
(a) user-defined period, stdev value, and possibly average/median parameters should ideally be specified on a per data element basis;
(b) adding the attribute option combo to the mix is probably required to cater for instances where data is captured for e.g. multiple collaborating NGOs;
(c) tools enabling the specification of said parameters for larger groups of data elements would make this easier to manage;
(d) a cherry on top would be the ability to adjust for typical seasonal fluctuations.
I will try to write a blueprint for something like the above; it is not a critical need, but it would be a positive step.
Regards
Calle
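Point (a) above could look something like the following Python sketch, where each data element carries its own window length, standard deviation multiplier and choice of average or median. Every name and default value here is hypothetical and only illustrates the proposal; none of it exists in DHIS2 as far as this thread shows. Points (b) to (d) are not covered.

```python
from dataclasses import dataclass
from statistics import mean, median, pstdev


@dataclass(frozen=True)
class MinMaxParams:
    """Hypothetical per-data-element settings."""
    months: int = 24            # how many recent monthly values to use
    stdev_factor: float = 2.0   # width of the band in standard deviations
    centre: str = "mean"        # "mean" or "median"


def bounds_for(monthly_values, params):
    """Compute (lower, upper) from the most recent values, oldest first."""
    if params.months < 1:
        raise ValueError("months must be at least 1")
    recent = monthly_values[-params.months:]
    if not recent:
        raise ValueError("no values available")
    mid = median(recent) if params.centre == "median" else mean(recent)
    spread = params.stdev_factor * pstdev(recent)
    return mid - spread, mid + spread


# Made-up series: an old regime of high values followed by a recent lower one.
series = [1000] * 12 + [100, 110, 90, 105, 95]
params = MinMaxParams(months=5, stdev_factor=2.0, centre="median")
print(bounds_for(series, params))
```

With a short window the old, much higher values no longer influence the bounds, which is the effect argued for earlier in the thread.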
Lars,
By the way - last week I saw that the bug related to OrgUnit counts in indicators is still there. I checked the Sierra Leone demo too, and it’s the same - integrity checks are still showing “org-unit-do not exist” etc.
Is this an indicator bug or an integrity check bug?
Regards
Calle