Hi,
I’ve added a new blueprint here:
https://blueprints.launchpad.net/dhis2/+spec/improve-minmax-value-functionality
It is about improving the min/max validation functionality. The current solution is very basic and falls short in several ways. Here are my thoughts on how to improve it. We can use this list for discussion and then update the blueprint once we settle on something concrete.
This is what I wrote in the blueprint:
A few improvements are needed to the min/max value functionality:
- Generation of min/max values should be available from the data administration module
Currently you need to generate min/max ranges for each orgunit/dataset combination one by one in the data entry module. Sometimes you want to generate ranges for all orgunits and datasets at once, and data entry is not the place for that. In Data Administration we can add a new menu heading called “Min/Max validation”, and there we can allow min/max generation for any combination of orgunit/dataset, with an easy way to select all combinations. It may also be a good idea to include “from” and “to” fields to indicate which periods to use as the basis for the generation. For example, from 2008-01-01 to 2008-12-31 would mean that all 12 months of 2008 are used if the dataset has a monthly period type, or the 4 quarters of 2008 if the dataset is quarterly, etc.
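To make the “from”/“to” idea concrete, here is a minimal sketch of how a date range could be turned into the list of whole periods used as the basis for generation. It is only an illustration: the function name `periods_in_range` and the `months_per_period` parameter are made up for this example and are not existing DHIS 2 code, and it only covers calendar-aligned period types (monthly, quarterly, yearly), not weekly.

```python
from datetime import date, timedelta


def periods_in_range(start, end, months_per_period):
    """Return (first_day, last_day) for every whole period inside [start, end].

    months_per_period is a hypothetical parameter: 1 = monthly,
    3 = quarterly, 12 = yearly. Periods are aligned to the calendar.
    """
    periods = []
    # Count months from year 0 and snap back to the start of the period
    # that contains the start date.
    idx = start.year * 12 + (start.month - 1)
    idx -= idx % months_per_period
    while True:
        first = date(idx // 12, idx % 12 + 1, 1)
        nxt = idx + months_per_period
        # Last day of this period = day before the next period starts.
        last = date(nxt // 12, nxt % 12 + 1, 1) - timedelta(days=1)
        if last > end:
            break
        # Skip a period that begins before the requested start date.
        if first >= start:
            periods.append((first, last))
        idx = nxt
    return periods


monthly = periods_in_range(date(2008, 1, 1), date(2008, 12, 31), 1)
quarterly = periods_in_range(date(2008, 1, 1), date(2008, 12, 31), 3)
print(len(monthly), len(quarterly))
```

With the 2008 range from the example above, this gives the 12 months for a monthly dataset and the 4 quarters for a quarterly one.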
- User defined parameters that control how the generation is done
Currently the range values are set to 10% lower than the lowest value and 10% higher than the highest value, which is a very crude method. It does not handle outliers that might already be in the system: a single bad value stretches the generated range. Any suggestions for a better statistical method for this, and for how to make it user defined?
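As a starting point for the discussion, here is a small sketch of the current 10% rule with the percentage turned into a parameter, next to one possible alternative based on the median and the median absolute deviation (MAD). The alternative is a common textbook technique and only an assumption on my part, not something decided for the blueprint. The function names, the parameters `margin` and `k`, and the sample values are all made up for illustration.

```python
import statistics


def percent_margin_range(values, margin=0.10):
    """Current rule: lowest value minus `margin`, highest value plus `margin`."""
    return min(values) * (1 - margin), max(values) * (1 + margin)


def median_mad_range(values, k=3.0):
    """Possible alternative: median +/- k * MAD, floored at zero.

    `k` would be the user-defined parameter; 3.0 is an arbitrary default.
    """
    center = statistics.median(values)
    mad = statistics.median([abs(v - center) for v in values])
    return max(0.0, center - k * mad), center + k * mad


# Six hypothetical monthly values; the last one is a typo-style outlier.
history = [40, 42, 45, 43, 41, 400]

print(percent_margin_range(history))  # upper limit is pulled up by the outlier
print(median_mad_range(history))      # upper limit stays close to the normal values
```

The point of the comparison is only that the current rule lets the existing outlier define the max, while a median-based rule does not. Whether the resulting range is too narrow in practice would depend on the multiplier and on how many periods are used.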
- I assume we would like to keep the generate min/max option in data entry, which can be useful for users who deal with only a limited number of orgunits and know that a new round of generation would correct the min/max ranges. But this generation should then be controlled by a setting, especially how many periods to use. So we could add another property in Data Administration->min/max validation that defines how many periods to use as the basis for the generation, for monthly, weekly, yearly etc. period types. Do we need one property per period type? Currently this property is hard-coded to 6 in the source code.
- Default min/max range per data element
Normally a min/max range is linked to an orgunit/data element combination, but sometimes, e.g. when there is very little data or very poor data quality in the system, it is useful to have a default range that can be used for all orgunits as a first level of validation to catch typos and crazy outliers. These default values need to be set somewhere, and data set management may be the best place for this; at least that is where it is located in DHIS 1.4. Here we need some functionality to set these ranges quickly, even as quickly as setting the same range for all data elements in a dataset, and then also the possibility to adjust individual data elements in the data (set) element list.
In Data entry the procedure will be to first check whether a min/max range exists for the orgunit/data element (the best option) and if not then load the default range for the data element (the next best option), and if nothing is set then leave it blank (the worst option).
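The lookup order in data entry could be sketched roughly like this. It is only an illustration of the fallback logic: the function `resolve_range`, the two dictionaries, and the sample identifiers are hypothetical and do not refer to existing DHIS 2 classes or tables.

```python
from typing import Dict, Optional, Tuple

Range = Tuple[int, int]  # (min, max)


def resolve_range(
    org_unit: str,
    data_element: str,
    specific: Dict[Tuple[str, str], Range],
    defaults: Dict[str, Range],
) -> Optional[Range]:
    """Pick the range to validate against in data entry.

    1. Range for this orgunit/data element, if one exists (best option).
    2. Otherwise the default range for the data element (next best).
    3. Otherwise None, meaning the range is left blank.
    """
    found = specific.get((org_unit, data_element))
    if found is not None:
        return found
    return defaults.get(data_element)


# Hypothetical sample data.
specific = {("Clinic A", "BCG doses given"): (20, 60)}
defaults = {"BCG doses given": (0, 500), "Measles doses given": (0, 500)}

print(resolve_range("Clinic A", "BCG doses given", specific, defaults))
print(resolve_range("Clinic B", "BCG doses given", specific, defaults))
print(resolve_range("Clinic B", "Deliveries", specific, defaults))
```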
best regards,
Ola Hodne Titlestad
HISP
University of Oslo