We have had a lot of discussion about this in Nigeria, and for them, saving zeros is VERY important. This is because of local operating and auditing procedures. The complete button is simply not enough for them. Just another use case.
However, from a statistical perspective, there are situations (in particular frequency analyses) where the zeros become important. We have needed to coalesce zeros where they should be present: if a facility has reported on a data element in the past but did not report on it in a particular month, it is assumed that the value SHOULD be zero, even if they did not report it, and even if the “Save zeros” function is active. Again, this is a local assumption. It is also hard to enforce. This is why I personally think that even though the intention of the “Save zeros” function is good (decrease data entry, increase database performance), one must be very careful how it is implemented in particular situations. Most statistical software (R, Stata, etc.) also requires values to be zero, or assumed to be zero if they are not present; otherwise, aggregation may not give the result you assume.
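As a rough illustration of the coalescing described above, here is a minimal Python sketch. The facility names, months, values, and the `coalesce_zeros` and `values_for` helpers are all hypothetical and are not part of DHIS2; the sketch only assumes that a missing month for a facility that has reported before should be read as zero.

```python
from statistics import mean

# Hypothetical reported values, keyed by (facility, month).
reported = {
    ("Facility A", "2011-08"): 4,
    ("Facility A", "2011-09"): 2,
    ("Facility A", "2011-10"): 6,
    ("Facility B", "2011-08"): 3,
    # Facility B reported nothing for 2011-09.
    ("Facility B", "2011-10"): 5,
    # Facility C reports for the first time in 2011-10.
    ("Facility C", "2011-10"): 1,
}

# Months must be listed in chronological order.
months = ["2011-08", "2011-09", "2011-10"]


def coalesce_zeros(reported, months):
    """Fill in 0 for each month a facility skipped after its first report."""
    filled = dict(reported)
    facilities = {facility for facility, _ in reported}
    for facility in facilities:
        has_reported_before = False
        for month in months:
            if (facility, month) in reported:
                has_reported_before = True
            elif has_reported_before:
                # Local assumption: a gap after an earlier report means zero.
                filled[(facility, month)] = 0
    return filled


def values_for(data, facility):
    """Return the values stored for one facility, in month order."""
    return [data[(f, m)] for (f, m) in sorted(data) if f == facility]


filled = coalesce_zeros(reported, months)

# Sums are unaffected by the missing zero, but averages are not.
print(sum(values_for(reported, "Facility B")), sum(values_for(filled, "Facility B")))
print(mean(values_for(reported, "Facility B")), mean(values_for(filled, "Facility B")))
```

Facility C gets no zeros for the months before its first report, which matches the "has reported in the past" condition. Whether that condition is the right one is a local decision.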
The much bigger problem is with validation rules, where NULLs really are NULLs, and not zeros. In DHIS 1.4, there is the notion of “compulsory pairs”, namely data elements which should be entered together. A good example of this is “Number of pregnant women who received PMTCT” (numerator) and “Number of pregnant women who tested positive for HIV” (denominator). For the indicator “Percentage of women who received PMTCT”, we require both the numerator and the denominator to calculate the indicator at the level of data entry. The data validation rule for this is something like “The number of women who receive PMTCT must be less than or equal to the number of HIV+ pregnant women”. However, if data is entered for the numerator and not the denominator (and the denominator is zero), and zeros are not saved, a data validation error will not result. If they are saved, a data validation error will result. So, you must be very careful about situations when zeros really are not significant and how you apply the compulsory data elements, especially if data validation rules for these data elements are involved. I personally think that if a data element is involved in a validation rule and no data is entered, it must either be assumed to be a zero or result in an error outright, but this seems not to be the case right now.
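The difference can be sketched in a few lines of Python. This is only a model of the behavior described above, not DHIS2's actual validation code: the `check_pmtct_rule` function and its `missing_as_zero` flag are hypothetical, the rule is assumed to be evaluated as numerator <= denominator, and a rule with a missing side is assumed to be skipped silently.

```python
def check_pmtct_rule(received_pmtct, hiv_positive, missing_as_zero):
    """Evaluate the rule received_pmtct <= hiv_positive.

    Returns True if the rule passes, False if it is violated, and None if
    the rule is skipped because one side is missing (NULL).
    """
    if missing_as_zero:
        # Treat a value that was never stored as an explicit zero.
        received_pmtct = 0 if received_pmtct is None else received_pmtct
        hiv_positive = 0 if hiv_positive is None else hiv_positive
    if received_pmtct is None or hiv_positive is None:
        # Assumed behavior: nothing to compare, so no error is raised.
        return None
    return received_pmtct <= hiv_positive


# Numerator entered, denominator left blank because it was zero.
print(check_pmtct_rule(5, None, missing_as_zero=False))  # rule skipped
print(check_pmtct_rule(5, None, missing_as_zero=True))   # rule violated
```

With the zero stored, or assumed, the inconsistent pair is flagged; with the zero dropped, the same data entry passes unnoticed.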
From a database and statistical perspective, I see a very big difference between a NULL and a zero (see archives for more of my rants on this). DHIS2 seems to think that they are the same in certain situations, but they are not always treated the same. My recommendation would be to save them, and partition them out into a separate table if they really become a problem and start to impact performance.
Regards,
Jason
On Thu, Nov 10, 2011 at 4:13 PM, Murod Latifov mlatifov@gmail.com wrote:
I meant that if an indicator has a 0 value at some orgunit, its absence may
affect the value of the indicator for “all orgunits” in that hierarchy.
On Thu, Nov 10, 2011 at 7:08 PM, Ola Hodne Titlestad olati@ifi.uio.no wrote:
On 10 November 2011 14:57, Murod Latifov mlatifov@gmail.com wrote:
You are right Ola, this mostly concerns indicators with average
values, not data elements, and saving zero values is inefficient
in most cases.
murod
Yes, and currently DHIS is not using “number of facilities that reported”
as, e.g., the denominator in the indicator formulas, so as far as I can see, this
doesn’t really affect the indicator values either. Or how do you mean?
We have discussed the possibility of including a variable in indicator
formulas that provides this number, and when we do, it might be more
relevant to keep track of at least some of the ‘0’s.
Ola
On Thu, Nov 10, 2011 at 3:23 PM, Ola Hodne Titlestad olati@ifi.uio.no wrote:
That’s true Murod, but most of the data elements collected will have an
aggregation operator “SUM” and never be averaged (in DHIS at least).
Typically the average data elements such as population estimates are never
‘0’, so this has never been an issue for data processing.
When people have reacted to the ignoring of ‘0’ values it has mostly
been related to completeness issues. As Jason says the complete button can
solve that problem.
You can also define certain data elements (called compulsory) to be
filled for a dataset to be considered complete.
Note that the default behaviour in DHIS is to ignore the ‘0’s. If you
want to save them, then you must set ‘Store Zero Data Value’ to “Yes” for
each data element where you want this behaviour.
Ola
Ola Hodne Titlestad (Mr)
HISP
Department of Informatics
University of Oslo
Mobile: +47 48069736
Home address: Vetlandsvn. 95B, 0685 Oslo, Norway.
On 10 November 2011 11:08, Murod Latifov mlatifov@gmail.com wrote:
Hi Mark,
Ignoring true zeros will affect average figures, e.g. division by the number
of occurrences will be wrong if you didn’t save the zeros.
regards,
murod
On Thu, Nov 10, 2011 at 11:52 AM, Muhire Andrew muhireandrew@yahoo.com wrote:
Dear Mark,
Most people treat zero as data, and that is also what I believe, but
on the other hand there are people who don't like zeros to appear in their
database; in that case they choose not to store them (during the creation
of the data element). But you also have to think about this: zero is different
from a blank field. If you ignore zeros, checking data quality for completeness
of the data fields becomes more complicated, because there you look at all filled
and unfilled fields as a percentage. So this means that when you activate
"don't store zero", the system will automatically ignore zeros.
Note that this can also depend on the data management protocols of
the institution.
Thanks
Muhire Andrew
HMIS/Ministry of Health
God is my provider.
From: Mark Spohr mhspohr@gmail.com
To: dhis2-users@lists.launchpad.net
Sent: Thursday, November 10, 2011 2:35 AM
Subject: [Dhis2-users] DHIS zeros and saving data?
Hi,
We just started data entry and things are going well and they like the
data entry.
I just have a few questions.
Zeros:
- All of the data items are set to “Don’t store zeros”… what does
that mean?
When totaling, aggregating, averaging, etc. will these data fields be
counted as zeros?
Related… on data entry I have them skipping over zero fields leaving
them blank… I assume this will be assumed to be a zero?
Also, there doesn’t seem to be an explicit “SAVE FORM” but the data
does appear to be saved (even though the User General Settings does NOT have
the “Auto-save data entry form” box checked). How does this work?
Thanks for this great software!
Mark Spohr, MD
Mailing list: https://launchpad.net/~dhis2-users
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-users
More help : https://help.launchpad.net/ListHelp