Data elements derive their period type from the data sets they are members
of.
Restated (what I just sent only to Lars by mistake): a datavalue derives
its period type from the data set of which its data element is a member.
And when they are members of two data sets with different period types, they
have multiple period types, right?
It's important to remain aware that it is values ultimately which have
periods (and hence period types).
And when you look at a value you can derive its period type in one of
two ways - via dataset or via period. Potentially these could
disagree. The one which derives from its period should be considered
authoritative, i.e. if the period is 2009-Jan then regardless of what
the dataset might say this really must be monthly. Of course we hope
these always agree. Incidentally the lookup from
dataelement-to-dataset-to-period looks like a greater complexity than
the lookup from period->periodType.
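A minimal sketch of that rule, assuming a simplified PeriodType enum and a hypothetical PeriodTypeResolver helper (neither is taken from the DHIS code base): the type carried by the value's period is returned as authoritative, and a disagreement with the dataset is only reported.

```java
enum PeriodType { MONTHLY, QUARTERLY, YEARLY }

// Hypothetical helper, for illustration only.
final class PeriodTypeResolver {
    private PeriodTypeResolver() { }

    // The type carried by the value's own period always wins.
    // viaDataSet may be null when the data element is in no data set.
    static PeriodType authoritative(PeriodType viaPeriod, PeriodType viaDataSet) {
        return viaPeriod;
    }

    // True when the data set claims a different type than the period does.
    // We hope this never happens, so a caller would probably log or reject it.
    static boolean disagrees(PeriodType viaPeriod, PeriodType viaDataSet) {
        return viaDataSet != null && viaDataSet != viaPeriod;
    }
}
```

For a value whose period is 2009-Jan, authoritative(MONTHLY, QUARTERLY) returns MONTHLY and disagrees(MONTHLY, QUARTERLY) is true.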
The key thing to look out for in data entry and data import is to avoid
overlaps in data values that will cause duplication when aggregating data.
E.g. if the SAME ORGUNIT registers values for the same data element for two
different period types that have overlapping periods, e.g. Jan-10 and Q1-10,
then the aggregate values for Q1-10, Jan-June 2010, and 2010 will all show
an incorrect value since the value for Jan-10 is counted twice.
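To make the double counting concrete, here is a small sketch with made-up numbers. RawValue and NaiveAggregator are hypothetical names, and the aggregation rule (sum every raw value whose period lies inside the target range) is an assumption about how a naive aggregator would behave, not a description of the datamart.

```java
import java.time.LocalDate;
import java.util.List;

// Hypothetical raw value for one orgunit and one data element.
final class RawValue {
    final LocalDate start;  // first day of the period
    final LocalDate end;    // last day of the period (inclusive)
    final int value;

    RawValue(LocalDate start, LocalDate end, int value) {
        this.start = start;
        this.end = end;
        this.value = value;
    }
}

final class NaiveAggregator {
    private NaiveAggregator() { }

    // Sums every raw value whose period falls entirely inside [from, to],
    // without checking whether the periods overlap each other.
    static int sum(List<RawValue> values, LocalDate from, LocalDate to) {
        int total = 0;
        for (RawValue v : values) {
            if (!v.start.isBefore(from) && !v.end.isAfter(to)) {
                total += v.value;
            }
        }
        return total;
    }
}
```

If the orgunit has entered 10 for Jan-10 and 30 for Q1-10 (a figure that already includes January), summing over Q1-10 gives 40 rather than 30.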
OK. That's a good concrete constraint to have.
One way to enforce this constraint is to monitor which datasets an orgunit
is assigned to, and not allow orgunits to be assigned to two datasets that
have the same data element AND different period types.
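One way such an assignment check might look, as a sketch only. DataSet here is a simplified stand-in (a name, a period type and a set of data element ids), and AssignmentValidator is a hypothetical helper, not an existing service.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

enum PeriodType { MONTHLY, QUARTERLY, YEARLY }

// Simplified stand-in for a data set, for illustration only.
final class DataSet {
    final String name;
    final PeriodType periodType;
    final Set<String> dataElementIds;

    DataSet(String name, PeriodType periodType, Set<String> dataElementIds) {
        this.name = name;
        this.periodType = periodType;
        this.dataElementIds = dataElementIds;
    }
}

final class AssignmentValidator {
    private AssignmentValidator() { }

    // True if the candidate shares at least one data element with a data set
    // already assigned to the orgunit AND the two period types differ.
    static boolean conflicts(List<DataSet> assigned, DataSet candidate) {
        for (DataSet existing : assigned) {
            if (existing.periodType == candidate.periodType) {
                continue;
            }
            Set<String> shared = new HashSet<>(existing.dataElementIds);
            shared.retainAll(candidate.dataElementIds);
            if (!shared.isEmpty()) {
                return true;
            }
        }
        return false;
    }
}
```

The check runs once per assignment rather than once per data value, so its cost is small compared with validating during import.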
Agreed, though this constraint should probably be imposed on forms
rather than datasets.
As far as I am aware,
we are not checking for this today. During data import it could be checked
on data element level by looking up the period type the way Bob has shown,
but that sounds like a lot of lookups and time-consuming validation, or?
On data import we don't really validate at all, beyond whatever
constraints the db imposes. For efficiency we simply pop the values in
with multiple insert statements. So this validation would have to
happen as a stage before the actual import or would have to be
constrained within the db. In fact it can't be validated easily
before the import as it is dependent on existing values within the db.
A relatively normal use case that we probably have to find a way to support,
and I think they are struggling with in Vietnam, is that different provinces
can use different period types for the same data elements (even for complete
data sets). E.g. the national data flow policy says to report
immunisation data every quarter, so that becomes the minimum requirement for
all provinces. Then some of the provinces decide that all their facilities
have to collect this data monthly anyway, and then at the province level
they simply send the quarterly aggregates to national level (in the
paper-based or Excel world). At the same time other provinces just collect
quarterly data at the facility level as in the minimum national requirement.
At the national level there is a need to consolidate all this data, even
data at the facility level, so ideally a national DHIS database should be
able to store both monthly and quarterly raw data values for the same data
elements, but for different orgunits. The national information users can
then easily generate quarterly reports on immunisation for all provinces,
while in some provinces they can do monthly data analysis if they want to
collect data using that frequency.
We support the above scenario by allowing the same data elements to be
assigned to different data sets with different period types, but we don't
control for misuse of this flexibility which can lead to duplication and
inconsistent aggregated data values as pointed out above.
Thinking further ... I really think the problem arises because we
have a dataset concept which represents a form and is also used to
constrain periodtypes on dataelements. Thinking of the use case you
have just described, it should be the case that one can have a paper
form which national level expects to collect quarterly, and the same
form be used at a lower level to collect data monthly. If we wanted
to mirror that use case electronically we would have to divorce the
form from the periodtype - i.e. a form would collect datavalues of a
certain period, but the same form could be used in different orgunits
for collecting data at a different frequency.
So (leaving dataset aside for the moment) if we can't assign a
periodtype to a form and we can't assign one to a dataelement and it's too
inefficient to validate on a one by one datavalue basis, what is a girl
to do?
I suspect the correct answer is to refactor datavalue and create a
datavalueset type - note: a set of datavalues rather than a set of
dataelements. Designing out loud, a datavalueset would have the
following:
1. a formid - the collection instrument used - roughly corresponds to
2. an orgunitid - where the datavalues come from
3. a periodid - the period of all the datavalues
couple of other useful attributes I can think of
Datavalue now becomes slightly simpler (which is always a good thing).
It only has:
value, dataelementid, categorycombooption, datavaluesetid
We can relatively efficiently validate that a datavalueset object is not
persisted which has the same formid, orgunitid and an overlapping
period.
There is no longer any ambiguity about periodtype of a datavalue.
stored_by, timestamp, comment might go either way. Probably they need
to stay on datavalue. I notice comment is rarely used but it's really
useful to have a comment on datavalueset for import purposes.
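A rough sketch of the datavalueset idea and its validation, under stated assumptions: the class and field names are hypothetical, a period is modelled as a start and end date, and "overlapping" is taken to mean that the two date ranges share at least one day.

```java
import java.time.LocalDate;
import java.util.List;

// Hypothetical period with inclusive start and end dates.
final class Period {
    final LocalDate start;
    final LocalDate end;

    Period(LocalDate start, LocalDate end) {
        this.start = start;
        this.end = end;
    }

    // Two periods overlap when neither one starts after the other has ended.
    boolean overlaps(Period other) {
        return !start.isAfter(other.end) && !other.start.isAfter(end);
    }
}

// Hypothetical datavalueset: which form, which orgunit, which period.
final class DataValueSet {
    final int formId;
    final int orgUnitId;
    final Period period;

    DataValueSet(int formId, int orgUnitId, Period period) {
        this.formId = formId;
        this.orgUnitId = orgUnitId;
        this.period = period;
    }
}

final class DataValueSetValidator {
    private DataValueSetValidator() { }

    // Rejects a candidate that has the same form and orgunit as an existing
    // set and a period overlapping that set's period.
    static boolean canPersist(List<DataValueSet> existing, DataValueSet candidate) {
        for (DataValueSet set : existing) {
            if (set.formId == candidate.formId
                    && set.orgUnitId == candidate.orgUnitId
                    && set.period.overlaps(candidate.period)) {
                return false;
            }
        }
        return true;
    }
}
```

With a Jan-10 set already stored for a form and orgunit, a Q1-10 set for the same pair is rejected, while a Feb-10 set, or a Q1-10 set from another orgunit, is accepted. Note that under this reading a second set for the identical period is rejected as well. The check is made once per set rather than once per datavalue.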
'nuff designing out loud. Got to go.
2010/5/20 Ola Hodne Titlestad <firstname.lastname@example.org>:
2010/5/20 Lars Helge Øverland <email@example.com>
On Thu, May 20, 2010 at 11:44 AM, Ola Hodne Titlestad <firstname.lastname@example.org> wrote:
After Kim Anh's email about the use of the same data elements with
different period types I dug up this old discussion from March 2009.
What is the status on this work, or did we not conclude this?
2009/3/20 Bob Jolliffe <email@example.com>
2009/3/20 Lars Helge Øverland <firstname.lastname@example.org>:
>> Yes this is true. But what do you think of the idea to enforce
>> DataSet membership having a default DataSet for all the delinquents?
>> I'm not sure if it can be enforced by the schema, but at least by the
> OK but what does this give us in terms of PeriodType-determining if
> default DataSet has a null PeriodType?
Nothing really. The only effect would be you have an index on the
unassigned DataElements for what it's worth. Mainly it would be useful
for determining easily the available DataElements which can be added
to a DataSet. Maybe it's a nonsense idea - I was just trying to think
of ways to make editing DataSets reasonably straightforward.
>> I don't know if it's about right or wrong. There are pros and cons of
>> both approaches. What you gain on the swings you lose on the
>> roundabouts.
>> In the explicit case the application will have to enforce that
>> members all have the same periodType.
>> In the implicit case the application will have to enforce that
>> DataElements can only be members of multiple groups if these share
>> the same PeriodType.
>> The net result as far as the Data API is concerned can and must be
>> the same. Perhaps we should define exactly what extra methods we want in
>> the API first. We have already identified a few. Then decide whether
>> a database change is necessitated by these.
> Yes. We need at least a service method:
> Collection<DataElement> getDataElementsByPeriodType( PeriodType )
> and a getter on the DataElement object:
> PeriodType getPeriodType()
> I guess we could make a branch, start coding and see how it works out.
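For illustration, the two methods could be sketched for the implicit case roughly as follows. Everything except the two method signatures quoted above is an assumption: the simplified DataSet and DataElement classes, the in-memory DataElementService, and the choice to return null for an unassigned DataElement and to throw when its DataSets disagree.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

enum PeriodType { MONTHLY, QUARTERLY, YEARLY }

// Simplified stand-ins, for illustration only.
final class DataSet {
    final PeriodType periodType;

    DataSet(PeriodType periodType) {
        this.periodType = periodType;
    }
}

final class DataElement {
    final String name;
    final List<DataSet> dataSets = new ArrayList<>();

    DataElement(String name) {
        this.name = name;
    }

    // Implicit derivation: null when the element is in no DataSet,
    // otherwise the PeriodType shared by all of its DataSets.
    PeriodType getPeriodType() {
        PeriodType found = null;
        for (DataSet dataSet : dataSets) {
            if (found != null && found != dataSet.periodType) {
                throw new IllegalStateException(
                        name + " is in DataSets with different PeriodTypes");
            }
            found = dataSet.periodType;
        }
        return found;
    }
}

final class DataElementService {
    private final Collection<DataElement> all;

    DataElementService(Collection<DataElement> all) {
        this.all = all;
    }

    // Linear scan over an in-memory collection; a real service would query.
    Collection<DataElement> getDataElementsByPeriodType(PeriodType periodType) {
        List<DataElement> matches = new ArrayList<>();
        for (DataElement element : all) {
            if (element.getPeriodType() == periodType) {
                matches.add(element);
            }
        }
        return matches;
    }
}
```

In this sketch the service call also throws if any DataElement sits in DataSets with different PeriodTypes, which is one way of surfacing a violated application-level constraint.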
Sure. So long as we are adding methods we won't be breaking anything
in terms of backward compatibility. Just enforcing application level
constraints. Then we can really encourage (enforce?) upper layers to
strictly interact with the data via the API. Even if this might
occasionally mean making some lightweight API methods which bypass the
> Another issue would arise in the (exotic) situation where someone
> assigns a DataElement to a DataSet, enters data for it, then removes it
> from the DataElement. The data is there, but how do we deal with it in
> regard to the mentioned required functionality (trend analysis, datamart)?
Yes this gets a bit weird (I presume you mean removes it from the
DataSet). I'm guessing you haven't lost the data because the
dataValues each have a PeriodID which in turn is linked to a
PeriodType. I suppose that (in such an exotic headspace) DataElements
can in fact change their PeriodTypes over time, though I imagine it's
not a great idea.
The effect would be the same in the explicit relationship case, if
someone assigns a DataElement to a DataSet, enters data for it, then
changes the PeriodType of the DataElement ...
Mailing list: https://launchpad.net/~dhis2-devs
Post to : email@example.com
Unsubscribe : https://launchpad.net/~dhis2-devs
More help : https://help.launchpad.net/ListHelp