Aggregation in the datamart and orgunit level specification

Hi there. I have made a couple of observations that I would like to
bring up with the devs. Currently, one can specify in the definition
of data elements and indicators the aggregation levels that should
result from a data mart export. It would appear that if no aggregation
levels are marked, no data is exported. This is the desired behavior,
but I have data imported from DHIS 1.4, and it would seem that the
aggregation levels we defined in 1.4 were not carried over into 2.0.

I am not sure exactly which revision this started in, but today I was
using one of the latest revisions, around 1486, and data mart exports
were failing for two reasons: 1) there were no data elements defined
for export (only indicators), see my bug report from today, and 2) the
aggregation levels had not been properly defined for the indicators
that were chosen. I reverted to an earlier revision (1442, I think)
and everything worked.

In DHIS 1.4, data is exported to the data mart based on the
aggregation levels defined for the data element. In 2.0, the
functionality is slightly different, as it allows the user to choose
the aggregation levels that should be exported. If these aggregation
levels have not been defined in the indicator/data element, they will
not export. These are not really bugs, but they are confusing. I
believe there should at least be a warning to the user that they have
chosen aggregation levels that are invalid for the chosen data
element/indicator combinations.

In 1.4 there are somewhat fewer choices, as I believe the data
elements/indicators are aggregated for all levels that have been
defined in the data elements/indicators themselves. I am wondering
whether it is really necessary to be able to choose the orgunit levels
that should be exported in a data mart operation, as these have
already been defined in the definition of the data element and
indicator. Another strategy would be a data integrity check that warns
the user when orgunit levels have been selected for a data mart export
without any corresponding aggregation levels in the definition itself.
Perhaps these checks are already there and I simply did not run them.
If so, please excuse this mail. :) However, I think the point about
the 1.4 aggregation levels not being imported is still valid
regardless.

Right, hope this is clear.

Best regards,
Jason

Hi Jason.

It seems you have misunderstood how the datamart export works in DHIS2. I can see how the 1.4 way of doing datamart export has confused you, but the datamart export is quite different in DHIS2.

The aggregation levels in data element definitions (there is no such thing for indicators) relate to the start level for aggregation, i.e. they indicate the source level for aggregations. They do not control, or have anything to do with, the orgunit levels of the orgunits in the datamart.

Which orgunits get exported to the datamart (the two tables aggregateddatavalue and aggregatedindicatorvalue) is controlled ONLY through the datamart export window, and there you define this per orgunit, not per orgunit level. There is a filter using orgunit level, but only the orgunits that are selected end up in the datamart. The same goes for data elements and indicators.

Every time you do a datamart you can change which orgunits to export data for. The data values will automatically be aggregated up to the orgunit that is selected, no matter what level. To simplify this selection process you can save a datamart export, which basically means saving the parameters (no data) so that you can run the same at a later point without re-selecting everything. New months need to be added though, while orgunits, data elements and indicators usually stay the same.
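A minimal sketch of the roll-up described above, under stated assumptions: the orgunit names, the levels and the `aggregate` helper are all hypothetical, values are simply summed, and the optional `source_level` argument is only one possible reading of the "start level for aggregation". It is not how DHIS2 itself implements the datamart.

```python
from typing import Dict, List, Optional

# Hypothetical orgunit tree: parent -> children (names are illustrative only).
CHILDREN: Dict[str, List[str]] = {
    "District A": ["Chiefdom 1", "Chiefdom 2"],
    "Chiefdom 1": ["PHU 1", "PHU 2"],
    "Chiefdom 2": ["PHU 3"],
}

# Hypothetical level of each orgunit in the hierarchy.
LEVEL: Dict[str, int] = {
    "District A": 2,
    "Chiefdom 1": 3, "Chiefdom 2": 3,
    "PHU 1": 4, "PHU 2": 4, "PHU 3": 4,
}


def aggregate(orgunit: str, raw: Dict[str, float],
              source_level: Optional[int] = None) -> float:
    """Sum raw values captured at `orgunit` and at all of its descendants.

    If `source_level` is given, only values captured at that level count
    (an assumed interpretation of the aggregation start level).
    """
    total = 0.0
    stack = [orgunit]
    while stack:
        unit = stack.pop()
        if source_level is None or LEVEL[unit] == source_level:
            total += raw.get(unit, 0.0)
        stack.extend(CHILDREN.get(unit, []))
    return total


# Raw values as captured, keyed by the orgunit where they were entered.
raw_values = {"PHU 1": 10.0, "PHU 2": 5.0, "PHU 3": 7.0, "District A": 100.0}

# Only the orgunits selected for the export end up in the result,
# whatever their level.
selected = ["District A", "Chiefdom 1", "PHU 3"]
datamart = {unit: aggregate(unit, raw_values) for unit in selected}
```

Here "Chiefdom 2" is absent from `datamart` because it was not selected, even though its facility still contributes to the "District A" total.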

I know it is very different in 1.4, where you specify in the data element and indicator definitions which orgunit levels the datamart should export to. In DHIS2 this is completely decoupled from the data element and indicator definitions, and it is up to any user to define which orgunits (at any level) they want to see aggregated data for.

So to get the datamart process to work for your database you need to create new datamart exports where you typically select all orgunits at a specific level and all data elements and indicators as needed.

E.g. for Sierra Leone we have set up a series of datamart exports:

  • PHU all indicators
  • Chiefdom (subdistrict) all indicators
  • District all indicators
  • PHU morbidity and mortality raw data
  • PHU EPI and nutrition raw data
  • PHU HIV data
  • PHU RCH data
  • Chiefdom Morbidity and Mortality data
  • etc.

These exports are then run every month when there is new data. Then you have to edit the selected periods and run the export one by one.
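The monthly routine above could be modelled roughly as follows. This is only a sketch: `SavedExport`, its fields and `monthly_run` are hypothetical names, and the period strings are an assumed format. The point it illustrates is that a saved export holds parameters only, and that the new month has to be added before each run.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SavedExport:
    """Hypothetical model of a saved datamart export: parameters, no data."""
    name: str
    orgunits: List[str]
    data_elements: List[str] = field(default_factory=list)
    indicators: List[str] = field(default_factory=list)
    periods: List[str] = field(default_factory=list)  # assumed "YYYY-MM" format

    def add_period(self, period: str) -> None:
        # Orgunits, data elements and indicators stay the same; only the
        # period list grows from month to month.
        if period not in self.periods:
            self.periods.append(period)


def monthly_run(exports: List[SavedExport], new_period: str) -> List[str]:
    """Add the new month to every saved export and return the run order."""
    order = []
    for export in exports:
        export.add_period(new_period)
        order.append(export.name)  # the actual export would be run here
    return order


exports = [
    SavedExport("PHU all indicators", ["PHU 1", "PHU 2"], periods=["2010-01"]),
    SavedExport("District all indicators", ["District A"]),
]
run_order = monthly_run(exports, "2010-02")
```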

Johan can give you more detailed explanations on how this is used in SL.

I have experienced (and reported here on the list) that there is a limit of about 250 data elements per export (due to Postgres limitations on crosstabbing columns), which is why we split up the raw data exports by program.
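As an illustration of working around that limit, the sketch below splits a list of data elements into batches by count. Note that this is a different strategy from the split by program described above, and that the figure of 250 is only the approximate limit reported in this thread; `split_into_batches` is a hypothetical helper.

```python
from typing import List, Sequence


def split_into_batches(items: Sequence[str], limit: int = 250) -> List[List[str]]:
    """Split `items` into consecutive batches of at most `limit` entries."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [list(items[i:i + limit]) for i in range(0, len(items), limit)]


# 600 made-up data element names, to be exported in several runs.
data_elements = [f"DE_{n}" for n in range(600)]
batches = split_into_batches(data_elements)
batch_sizes = [len(batch) for batch in batches]
```

Splitting by program, as done in Sierra Leone, has the advantage that each export stays meaningful on its own; splitting by count is simpler to automate.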

Ideally all datamarts should be run as batch jobs when there is new data and nobody is using the system, at least in a server setup.

Hope this was clarifying.

Ola

On 24 February 2010 18:21, Jason Pickering jason.p.pickering@gmail.com wrote the original message at the top of this thread.

Mailing list: https://launchpad.net/~dhis2-devs

Post to : dhis2-devs@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-devs

More help : https://help.launchpad.net/ListHelp

Hi Ola,
Thanks for the clarification. It has been a very long and frustrating
day, and I was ready to throw in the towel after I experienced a slew
of problems. I will need to try again tomorrow, after some sleep, to
describe more clearly the problem that I experienced during the
export.

Still, I feel there is something that is not clear here. Perhaps it
is the terminology "Available aggregation levels", which, as you
explain it, is actually the aggregation start level and not the
available aggregation levels. Is this correct? Here, for instance, we
have a figure "Population". Facility catchment populations are
recorded here, but they are not aggregated to the district level,
which means that facility level indicators should use population
values entered at the facility level. District level indicators
should use the district level populations (OU3) that have been
entered, and facility level (OU4) values should not be aggregated to
OU3. So, what should I choose in the "Available aggregation levels"
in this case?

Totally agree about the batch running of the data mart. It would be
great to be able to 1) schedule it 2) trigger the process via a script
that calls a specific URL.
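A rough sketch of what such a script could look like, assuming the server exposed a URL for this. The `/datamart/run` path and the `id` parameter are invented for illustration and are not a known DHIS2 endpoint; only the standard-library calls are real.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_trigger_request(base_url: str, export_id: int) -> Request:
    """Build (but do not send) a POST request for a hypothetical trigger URL."""
    # "/datamart/run" and "id" are assumptions, not a documented DHIS2 API.
    query = urlencode({"id": export_id})
    return Request(f"{base_url.rstrip('/')}/datamart/run?{query}", method="POST")


def trigger_export(base_url: str, export_id: int, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code."""
    with urlopen(build_trigger_request(base_url, export_id), timeout=timeout) as resp:
        return resp.status


# Building the request needs no running server; sending it does.
request = build_trigger_request("http://localhost:8080/dhis", 3)
```

A scheduler such as cron could then call `trigger_export` at night, when nobody is using the system.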

However, what is the point of crosstabbing on data elements? This
seems very strange to me. Shouldn't this be something for a view to
worry about?

I will spend more time with Johan to figure this out tomorrow. Thanks
for the clarification (and please document this!!!!!)

Regards,
Jason

On Wed, Feb 24, 2010 at 7:50 PM, Ola Hodne Titlestad <olatitle@gmail.com> wrote the reply above, which quoted Jason Pickering's message of 24 February 2010 18:21 in full.