The use of dimensions in data entry and data analysis (was: commit message for Rev 938)

olatitle · 30 October 2009 16:35

Perhaps it is a bad example but it raises a good point, and we should perhaps move this to a new thread if it continues to balloon.

I changed the subject line. It might be too general, but it is still better than a reply to a commit message.

My understanding was that the category options would be used for data entry. This is not really an issue about 1.4; it is an issue about whether people will enter totals or not. There is nothing to prevent people from defining a category, Gender, with three (or more) options, “Male”, “Female” and “Total”, and it may be necessary. Let me explain. On the paper tools used here in Zambia, there is a separate column “Total” which is the sum of three age groups (Under 1, 1-5 and Over 5). If I were to implement the multidimensional data elements here and wanted to replicate the paper tool exactly, I would need a separate column for totals. This is what we have now, and it serves a good purpose, as the data entry personnel can see whether the totals provided by the facility actually match the calculated totals.

This raises an interesting point related to the discussion we have had about the role of data sets and data entry forms. To me such a control column like “total” is simply a GUI feature and I don’t think it should be reflected in the data model or persisted.

It would be great if we could add this feature to our data entry module. What I see here is a need for an option to add a total column to each categorycombination and then to automatically populate this field as the other fields of the row get filled. This is not a new request, as it has been mentioned several times (I remember quite a heated discussion about the use of calculated data elements a few years ago), but with a new take on the data set and form relation and a refined multidimensional model this might be a better time to look at this.
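A minimal sketch of such an auto-populated total column, assuming a data entry row is simply a mapping from category option name to the entered value. The function names `row_total` and `totals_match` are hypothetical and not part of DHIS2; the total is computed on the fly and never persisted, in line with treating it as a GUI feature.

```python
from typing import Dict, Optional


def row_total(cells: Dict[str, Optional[int]]) -> int:
    """Sum the entered cells of one data entry row; blank cells count as 0."""
    return sum(value for value in cells.values() if value is not None)


def totals_match(cells: Dict[str, Optional[int]], reported_total: int) -> bool:
    """Check the total written on the paper form against the computed total."""
    return row_total(cells) == reported_total


# Hypothetical row using the three age groups from the Zambian paper tool.
row = {"Under 1": 12, "1-5": 30, "Over 5": None}
print(row_total(row))          # 42
print(totals_match(row, 42))   # True
```

The second function covers the control use case: the data entry clerk types the facility's own total, and the form flags the row when the two disagree.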

And I agree with Bob, to get these totals in a report is a matter of adding this to the GUI somehow, the ability to add total columns for data elements + category combos.

No idea if this is how the categories work in DHIS2. But from the analysis standpoint, it would seem that you would need some calculated data element as well that would calculate the total from the multidimensional components of the data element, unless, as you say, you are going to rely on OLAP or PivotTables to always do this aggregation for you.

At least for categories and options there should be no need to go to OLAP to get this.

And although more complicated, I would think it should be possible to also extract totals from a data element group set model with a similar logic to what I described earlier. I guess that is the point of the new dimension service, which abstracts away the difference between categories and group sets. Is that correct, Lars/Bob?

I would think that actually having the ability to persist and store the data value, as a calculated data element (Save calculated) and assign it a Category option of “Total” (which might be implicit anyway in the system) would make sense, since you might need it directly in a report or something and do not want to have to revert to OLAP or custom SQL to get this. But again, I am looking at this from the perspective of a bunch of data elements which do not use category options.

You would get the totals as you state, but only by using OLAP. What about if I want to create an Excel report with only Totals? Now if the new model will automatically give me the totals from the component dimensions, great, but I did not see this in the blueprint.

You are right, getting totals from the group set/groups part of dimension/dimensionoptions was not covered, I think.
We need to add this to the blueprint. The idea was to abstract away the difference between categories and group sets at the point of data analysis, e.g. when defining new report tables, so I guess this means more complexity for the dimension service Lars is working on.

Ola

···

2009/10/30 Jason Pickering jason.p.pickering@gmail.com

I was assuming that I would need to explicitly define a separate, calculated element for this.

Regards,

Jason

On Fri, Oct 30, 2009 at 5:34 PM, Ola Hodne Titlestad olatitle@gmail.com wrote:

2009/10/30 Jason Pickering jason.p.pickering@gmail.com

OK, I took a walk around the block to think about this a bit more. I think it does make sense, sort of. Let's look at “Total”, which might be defined as a calculated data element, say composed of different age groups. But the “Total” in this category would not be the same as the “Total” that might be defined in a different category, or would it?

I thought the whole point of the category/categoryoption/categorycombo model was that the total would be the data element itself without any categoryoption? The “total” should then not be defined as one of the options, but always be derived from the sum of all the options.

Your example, Jason, is from a 1.4 design point of view where you are not using this model, but normally need calculated data elements to get to a total (since the categoryoptions are part of the data element names). With the new data element group set model I guess you can derive the total for e.g. “Malaria new cases OPD” by filtering on the data element group “Malaria” in the group set “Diseases” plus the group called “New cases” in the group set “Patient status”, and then simply summing up all the data elements in the two group sets “Gender” and “Morbidity age group”. Wouldn't such an approach give you the totals you need?

As for exactly how we could accommodate that within DHIS2, e.g. in a report table GUI, I am not sure. It seems complicated and something for an OLAP tool to take care of.
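The filter-then-sum approach described above can be sketched as follows. This is only an illustration under assumptions: the in-memory record layout, the `total_for` helper and all values are made up, and nothing here reflects how DHIS2 actually stores group set membership.

```python
# Each data element carries one group per group set, plus an aggregated value.
# All names and numbers below are hypothetical.
ELEMENTS = [
    {"groups": {"Diseases": "Malaria", "Patient status": "New cases",
                "Gender": "Male", "Morbidity age group": "Under 5"}, "value": 7},
    {"groups": {"Diseases": "Malaria", "Patient status": "New cases",
                "Gender": "Female", "Morbidity age group": "Under 5"}, "value": 5},
    {"groups": {"Diseases": "Malaria", "Patient status": "New cases",
                "Gender": "Male", "Morbidity age group": "Over 5"}, "value": 3},
    {"groups": {"Diseases": "Malaria", "Patient status": "Follow-up",
                "Gender": "Male", "Morbidity age group": "Under 5"}, "value": 4},
    {"groups": {"Diseases": "Diarrhoea", "Patient status": "New cases",
                "Gender": "Female", "Morbidity age group": "Under 5"}, "value": 9},
]


def total_for(elements, filters):
    """Sum the values of all elements matching every (group set, group) filter.

    Group sets that are not mentioned in `filters` are summed over.
    """
    return sum(
        e["value"]
        for e in elements
        if all(e["groups"].get(gs) == g for gs, g in filters.items())
    )


# "Malaria new cases": fix two group sets, sum over Gender and age group.
print(total_for(ELEMENTS, {"Diseases": "Malaria", "Patient status": "New cases"}))  # 15
```

Adding `"Gender": "Male"` to the filter narrows the same query to a subtotal, which is the slicing operation discussed elsewhere in this thread.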

Ola

Having a single categoryoption “Total” would allow one to slice out particular groups of dimensional elements, which is a fairly common operation as Ola mentions, with a single filter statement. Otherwise, you would need to collect all of the "Total"s for different categories through another table and perform an inner join, as opposed to a filter. For multiple category options, I guess there would need to be a decision made whether to perform an inner join or loop through a filter, but I guess an inner join would actually be better for either one or many category options (have not looked at the code). If the uniqueness constraint is not there, the user would need to select all "Total"s in a separate step and then perform an inner join, as there would be no intrinsic relationship between “Total” in the “Age” category and the “Total” in the “Gender” category. This might be very tedious if there are many categories to select from. Having multiple category options with the same name does not make sense in this case, and I think this is what everyone is saying?

Obviously there should not be two category options called “Total” within a single category/data element group set. However, I am not sure I completely understand your point, Ola. To me, the use case you describe is very typical: "Give me all data for the under 1 age group", “Give me all data on in patient discharges”. Having to define multiple “under 1” and “IPD” for each category seems to be very inefficient, as well as painful.

So, I guess maybe I am answering my own mail…I think.

2009/10/30 Lars Helge Øverland larshelge@gmail.com:

On Fri, Oct 30, 2009 at 2:43 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Could someone remind me once again what the point of having a category option in two separate categories is? Is there a use case here? It does not seem totally obvious, but maybe I am missing something.

It might be that there are none. This could be useful in the sense that if nobody asks for removing the constraint - we won’t.

Mailing list: https://launchpad.net/~dhis2-devs

Post to : dhis2-devs@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-devs

More help : https://help.launchpad.net/ListHelp

bobj · 30 October 2009 17:39

Don’t really have much time to contribute to this discussion right now, but …

At least for categories and options there should be no need to go to OLAP to get this.
And although more complicated, I would think it should be possible to also extract totals from a data element group set model with a similar logic to what I described earlier. I guess that is the point of the new dimension service, which abstracts away the difference between categories and group sets. Is that correct, Lars/Bob?

My (radical) idea on this is that a GroupSet should actually “BE” a dataelement. Reason comes down to the fact that values have dimensions. And those dimensions can be different depending on the dataelement used.

eg (using shorthand)

Here’s a datavalue in its “raw” form

<dv de="Immunization_Male_Under5" Value="5"/>

Now let's say there are groups gender and age defined of which the above is a member. And a groupset Immunization. Then here’s the same datavalue

<dv de="Immunization" gender="M" Age="<5" Value="5"/>

Now what about that same de, but without the dimensions:

<dv de="Immunization" Value="105"/>

where I guess 105 would be the Total of all the underlying datavalues.
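As a rough sketch of this idea, a dimensionless value can be treated as the roll-up of all the dimensioned values of the same data element. The record layout and the `roll_up` function below are assumptions made for illustration; only the "Immunization" name, the M/<5 value of 5 and the total of 105 come from the example, and the other two values are invented so that the sum works out.

```python
from collections import defaultdict

# Hypothetical dimensioned data values for one data element.
VALUES = [
    {"de": "Immunization", "dims": {"gender": "M", "age": "<5"}, "value": 5},
    {"de": "Immunization", "dims": {"gender": "F", "age": "<5"}, "value": 40},
    {"de": "Immunization", "dims": {"gender": "M", "age": "5+"}, "value": 60},
]


def roll_up(values, keep=()):
    """Sum values per data element, keeping only the dimensions named in `keep`.

    With keep=() every dimension is dropped and the result is the plain total.
    """
    totals = defaultdict(int)
    for dv in values:
        key = (dv["de"],) + tuple(dv["dims"].get(d) for d in keep)
        totals[key] += dv["value"]
    return dict(totals)


print(roll_up(VALUES))                    # {('Immunization',): 105}
print(roll_up(VALUES, keep=("gender",)))  # {('Immunization', 'M'): 65, ('Immunization', 'F'): 40}
```

Dropping dimensions one at a time gives the kind of hierarchy hinted at here: the same data element viewed at different levels of detail.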

In fact what would be very nice would be to do away with groups/groupsets entirely. Less is more. Just have (calculated?) dataelements which can form hierarchies (like orgunits). We’re not too far from here at the moment. Another little step and we’ll be over the edge.

I’ll think more about this later. Right now in a rush to implement dxf2 parser …

Cheers
Bob

jason · 31 October 2009 19:31

Bob, I think you hit the nail on the head, and this is sort of what I
was getting at in my previous mails, but will try and explain more
here.

First, I think the current implementation of the DHIS aggregation
engine is clearly the way to go when it comes to materializations of
data values, but as Bob hints at, we are not quite there yet.
Calculated data elements and indicators can follow complex aggregation
rules that OLAP does not understand well. This was one of the hard
lessons we learned from the OpenHealth functional prototype. Not all
"multidimensional" data elements follow multidimensional aggregation
rules that OLAP engines are familiar with. The calculated data
element/indicator functionality allows one to define complex rules for
how data elements should be calculated, which OLAP generally is not
capable of handling. SUM, AVG, COUNT work well with OLAP, but factors,
aggregation start levels, and some of the other features of the
aggregation engine have necessitated a custom solution. Mind you,
there is still room for improvement here. I would personally like
more operators (COUNT, STDEV) and ideally an integrated scripting
support to define highly complex indicators/data elements.

The aggregateddatavalue/indicator/report tables are very useful
artefacts to report builders/analysts. I would hate to have to
replicate in SQL/OLAP what the data aggregation service does for me.
This is what procedural languages are for after all. These tables
provide a very useful, albeit one could argue bulky, materialized view
of the data in the routine data/semipermanent data tables. But disk
space is cheap. With proper definition of calculated data elements and
indicators, followed by materialization into report tables via the data
mart, report builders and analysts have very useful, simple tables
they can readily work with.

Get to the point Jason you say! Right. The point is, that data element
group sets (at least as I am seeing them in the blueprints) have an
assumed and implicit aggregation pathway between dimension group set
element/category options. As an example, (Under 1) + (1-<5) + (Over
5) = Total for "Age" perhaps. Why would I not want to define this
relationship explicitly in the form of a calculated data element
instead, when the logic and procedures already exist? How about if I
want the population for the "Under 5" age group? This would be the
sum "Under 1" and "1-<5" right? If I need this value in a report, how
would I get it? Would DHIS automagically know that the "Under 5"
category is a result of the aggregation of two other category options?
It would not seem that it could know, without me defining a calculated
data element and assigning it a category option of "Under 5". Perhaps
I would not want to show this dimension option in the data entry form,
but I might be interested in having it in a report or other table for
analysis. Are we going to require that people pull out "Under 1" and
"1-<5" into a PivotTable, perform the aggregation, and import it back
into the DB? No, I would not think this would be the right solution.
So, it would seem to me for this use case, we would need to define
explicitly a calculated data element, with specific aggregation
operations, that would tell me how to add "Under 1" and "1-<5" in
order to get the "Under 5" age group. Thus, I am not sure that the
category options go far enough in allowing me to explicitly define how
totals are calculated. I agree that in many cases, the total will be
the sum of the parts, but not always.
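To make the "Under 5" case concrete, here is a small sketch of what an explicit definition could look like: a derived category option is declared as the sum of named component options instead of being assumed. The `DERIVED` table, the `derive` function and the numbers are all hypothetical and only illustrate the idea.

```python
from typing import Dict

# Explicit, user-defined roll-ups: derived option -> component options.
DERIVED = {
    "Under 5": ["Under 1", "1-<5"],
    "Total": ["Under 5", "Over 5"],
}


def derive(entered: Dict[str, int], option: str) -> int:
    """Return the value for `option`, computing it from its parts if needed.

    Entered values win; otherwise the option must have an explicit rule.
    """
    if option in entered:
        return entered[option]
    if option not in DERIVED:
        raise KeyError(f"no value and no rule for category option {option!r}")
    return sum(derive(entered, part) for part in DERIVED[option])


entered = {"Under 1": 120, "1-<5": 480, "Over 5": 2400}
print(derive(entered, "Under 5"))  # 600
print(derive(entered, "Total"))    # 3000
```

The derived options never need to appear on the data entry form; they only exist as rules that a report or data mart run can evaluate.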

Let me take another example to further clarify my point. Disregard my
previous paragraph for a moment in regards to the age groups. Let us
assume we have population values provided to us by official sources
for "Pop. Under 1" , "Pop. Under 5", "Population 5-15", "Population
15-49", "Population Over 49". Let us assume I would define a category
"Age" and specify options "Under 1", "1-<5", "5-15", "15-49", and
"Over 49". I need "1-<5" for the calculation of certain indicators,
although it has not been provided to me. Let us further assume that I
routinely collect data for three age groups: "Under 1", "1-<5" and "Over
5". This would imply that if I define a multidimensional data element
for "Total Population" I would need the following rule.

"Total Population" = "Population Under 5" + "Population 5-15" +
"Population 15-49" + "Population Over 49"

In this case, the "Total" would not be the sum of the component parts.
Does this exclude me from using the "category" functionality in this
case? Or would I need to somehow exclude the "1-<5" age group from the
category, as it is not used in data entry? If so, would I need to
define it as a plain old non-multidimensional calculated data element?
It feels we are missing something here.

In order to calculate the "1-<5" population group coverage rate for a
particular data element, I need to define a calculated data element in
order to get the proper denominator:

"Population 1-<5" = "Population Under 5" - "Population Under 1".

Note the minus sign there. It must be defined explicitly, which says
to me we cannot always assume that the operator between
category/dimension elements is always a "+". Thus, we cannot simply
assume that we can always add category options up in order to get
another data element. Even "Total" is not a safe bet, as in my
example, I would enter a value that would not be aggregated in order
to obtain the "Total".
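One way to picture such explicit operators is a rule made of signed operands, so that "+" and "-" are both part of the definition. This is a sketch under assumptions, not the DHIS calculated data element implementation: the `RULES` layout and `evaluate` helper are hypothetical and the population figures are invented.

```python
from typing import Dict, List, Tuple

# A rule is a list of (sign, operand name) pairs.
Rule = List[Tuple[int, str]]

RULES: Dict[str, Rule] = {
    "Population 1-<5": [(+1, "Population Under 5"), (-1, "Population Under 1")],
    "Total Population": [
        (+1, "Population Under 5"),
        (+1, "Population 5-15"),
        (+1, "Population 15-49"),
        (+1, "Population Over 49"),
    ],
}


def evaluate(rule: Rule, values: Dict[str, int]) -> int:
    """Apply each operand with its explicit sign and add up the results."""
    return sum(sign * values[name] for sign, name in rule)


# Invented official figures; "Under 1" is already contained in "Under 5".
official = {
    "Population Under 1": 400,
    "Population Under 5": 1900,
    "Population 5-15": 2600,
    "Population 15-49": 4300,
    "Population Over 49": 800,
}
print(evaluate(RULES["Population 1-<5"], official))   # 1500
print(evaluate(RULES["Total Population"], official))  # 9600
```

Note that "Population Under 1" is an entered value that takes no part in the total, which is exactly the situation where a blanket "sum all options" rule would double count.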

How then do we handle the issue of dimensional hierarchies? Well,
with OrgUnit hierarchies, I have the ability to decide how the
aggregation takes place, to some degree. Here in Zambia, we allow
districts to enter facility catchment populations, which allows them to
calculate facility coverage rates. The sum of all the catchment
populations of all the facilities in a given district, does not
necessarily add up to the "official" district population figures,
which according to government policy, must be used to calculate
district coverage rates. DHIS allows me to define this explicitly by
deciding where the aggregation of the population figures start, in
our case at the district level.
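The effect of an aggregation start level can be illustrated with a toy hierarchy. This is my reading of the behaviour described above, written as a sketch: the `OrgUnit` class, the level numbering (1 = country, 2 = district, 3 = facility) and the figures are assumptions, and the real aggregation engine may work differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OrgUnit:
    name: str
    level: int                      # 1 = country, 2 = district, 3 = facility
    value: Optional[int] = None     # value entered directly at this unit
    children: List["OrgUnit"] = field(default_factory=list)


def aggregate(unit: OrgUnit, start_level: int) -> int:
    """Units at or below the start level use their own entered value;
    units above it are the sum of their children."""
    if unit.level >= start_level:
        return unit.value or 0
    return sum(aggregate(child, start_level) for child in unit.children)


district_a = OrgUnit("District A", 2, value=50000, children=[
    OrgUnit("Clinic 1", 3, value=20000),
    OrgUnit("Clinic 2", 3, value=25000),
])
district_b = OrgUnit("District B", 2, value=30000, children=[
    OrgUnit("Clinic 3", 3, value=28000),
])
country = OrgUnit("Country", 1, children=[district_a, district_b])

print(aggregate(district_a, start_level=2))  # 50000, the official district figure
print(aggregate(district_a, start_level=3))  # 45000, the sum of catchment populations
print(aggregate(country, start_level=2))     # 80000
```

With the start level at the district, facility catchment populations remain available for facility coverage rates but never leak into district or national figures.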

What about the period hierarchy? Where can I define explicit rules
about how to derive quarterly figures from monthly figures?
Well, in this case, I would need to define a rule that says that any
data value that has a period attribute of "Jan", "Feb" or "Mar" would
fall into "1st quarter". What about if I use financial quarters
instead of calendar quarters? It feels again that I need the ability
to define aggregation rules within a dimension, to derive either the
total or other values that may not be entered, as well as between
dimensions themselves.
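For the period dimension, such a rule might be as small as a function from month to quarter with a configurable start of year. The names `quarter_of` and `quarterly_totals` and the `fy_start_month` parameter are hypothetical; the sketch only shows that calendar and financial quarters differ by one explicit setting.

```python
from typing import Dict


def quarter_of(month: int, fy_start_month: int = 1) -> int:
    """Map a month (1-12) to a quarter (1-4) for a year starting in fy_start_month."""
    return ((month - fy_start_month) % 12) // 3 + 1


def quarterly_totals(monthly: Dict[int, int], fy_start_month: int = 1) -> Dict[int, int]:
    """Sum monthly values into quarters according to the chosen year start."""
    totals: Dict[int, int] = {}
    for month, value in monthly.items():
        quarter = quarter_of(month, fy_start_month)
        totals[quarter] = totals.get(quarter, 0) + value
    return totals


monthly = {1: 10, 2: 20, 3: 30, 4: 40}
print(quarterly_totals(monthly))                    # {1: 60, 2: 40}
print(quarterly_totals(monthly, fy_start_month=4))  # {4: 60, 1: 40}
```

With a financial year starting in April, January to March become the fourth quarter, so the same monthly data yields different quarterly figures depending on the rule in force.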

These examples are not completely fabricated. There is a need to be
able to define, explicitly, operators regarding how aggregation takes
place within a dimension/category. When an analyst pulls the data into
a PivotTable, s/he is defining the rules dynamically. However for
reports and other materialized tables, how are we going to materialize
the values and present them in a format that is usable to people not
using external OLAP/analysis tools?

This mail turned out to be a bit longish, but I agree with Bob. We
are close and I think the generalization of the dimension concept is a
definite step in the right direction, but it feels we need to make the
extra push and see if we can get it right.

Best regards,
Jason

···

On Fri, Oct 30, 2009 at 7:39 PM, Bob Jolliffe <bobjolliffe@gmail.com> wrote:

Don't really have much time to contribute to this discussion right now, but
...

My (radical) idea on this is that a GroupSet should actually "BE" a
dataelement. Reason comes down to the fact that values have dimensions.
And those dimensions can be different depending on the dataelement used.

eg (using shorthand)

Here's a datavalue in its "raw" form
<dv de="Immunization_Male_Under5" Value="5"/>
Now lets say there are groups gender and age defined of which the above is a
member. And a groupset Immunization. Then here's the same datavalue
<dv de="Immunization" gender="M" Age="<5" Value="5"/>
Now what about that same de, but without the dimensions:
<dv de="Immunization" Value="105"/>

where I guess 105 would be the Total of all the underlying datavalues.

In fact what would be very nice would be to do away with groups/groupsets
entirely. Less is more. Just have (calculated?) dataelements which can
form hierarchies (like orgunits). We're not too far from here at the
moment. Another little step and we'll be over the edge.

I'll think more about this later. Right now in a rush to implement dxf2
parser ...

Cheers
Bob

I would think that actually having the ability to
persist and store the data value, as a calculated data element (Save
calculated) and assign it a Category option of "Total" (which might be
implicit anyway in the system) would make sense, since you might need
it directly in a report or something and do not want to have to revert
to OLAP or custom SQL to get this. But again, I am looking at this
from the perspective of a bunch of data elements which do not use
category options.

You would get the totals as you state, but only by using OLAP. What
about if I want to create an Excel report with only Totals? Now if the
new model will automatically give me the totals from the component
dimensions, great, but I did not see this in the blueprint.

You are right, getting total from the group set/groups part of
dimension/dimensionoptions was not covered I think.
We need to add this to the blueprint. The idea was to abstract away the
difference between categories and group sets at the point of data analysis,
e.g. when defining new report tables, so I guess this means more complexity
to the dimension service Lars is working on.

Ola
---------

I was
assuming that I would need explicitly define a separate, calculated
element for this.

Regards,
Jason

On Fri, Oct 30, 2009 at 5:34 PM, Ola Hodne Titlestad <olatitle@gmail.com> wrote:
> 2009/10/30 Jason Pickering <jason.p.pickering@gmail.com>
>>
>> OK, I took a walk around the block to think about this a bit more. I
>> think it does, make sense, sort of. Lets look at "Total", which might
>> be defined as a calculated data element, say composed of different age
>> groups. But the "Total" in this category, would not be the same as the
>> "Total" that might be defined in a different category, or would it?
>>
>
> I thought the whole point of the category/categoryoption/categorycombo
> model
> was that the total would be the data element itself without any
> categoryoption? The "total" should then not be defined as one of the
> options, but be always be derived from the sum of all the options.
>
> Your example Jason is from a 1.4 design point of view where you are not
> using this model, but normally need calculated data elements to get to
> a
> total (since the categoryoptions are part of the data element names).
> With
> the new data element group set model I guess you can derive the total
> for
> e.g. "Malaria new cases OPD" e.g. by filtering on the data element
> group
> "Malaria" in the group set "Diseases" plus the group called "New cases"
> in
> the group set "Patient status" and then simply sum up all the data
> elements
> in the two groups sets "Gender" and "Morbidity age group". Would't such
> an
> approach give you the totals you need?
>
> As in exactly how we could accommodate that within DHIS2 e.g in a
> report
> table GUI I am not sure. Seems complicated and something for an OLAP
> tool to
> take care of.
>
> Ola
> -----------
>
>> Having a single categoryoption "Total" would allow one to slice out
>> particular groups of dimensional elements, which is a fairly common
>> operation as Ola mentions, with a single filter statement. Otherwise,
>> you would need to collect all of the "Total"s for different categories
>> through another table and perform an inner join, as opposed to a
>> filter. For multiple category options, I guess there would need to be
>> a decision made whether to perform an inner join or loop through a
>> filter, but I guess an inner join would actually be better for either
>> one or many category options (have not looked at the code). If the
>> uniqueness constraint is not there, the user would need to select all
>> "Total"s in a separate step and then perform an inner join,
>> as there would be no intrinsic relationship between "Total" in the
>> "Age" category and the "Total" in the "Gender" category. This might be
>> very tedious if there are many categories to select from. Having
>> multiple category options with the same name does not make sense in
>> this case, and I think this is what everyone is saying?
>>
>>
>>
>> Obviously there should not be two category options called "Total"
>> within a single category/data element group set. However, I am not
>> sure I completely understand your point, Ola. To me, the use case you
>> describe is very typical. "Give me all data for the under 1 age
>> group", "Give me all data on in-patient discharges". Having to define
>> multiple "under 1" and "IPD" for each category seems to be very
>> inefficient, as well as painful.
>>
>> So, I guess maybe I am answering my own mail...I think.
>>
>>
>>
>>
>> 2009/10/30 Lars Helge Øverland <larshelge@gmail.com>:
>> >
>> >
>> > On Fri, Oct 30, 2009 at 2:43 PM, Jason Pickering <jason.p.pickering@gmail.com> wrote:
>> >>
>> >> Could someone remind me once again what the point of having a
>> >> category option in two separate categories is? Is there a use case
>> >> here? It does not seem totally obvious, but maybe I am missing
>> >> something.
>> >>
>> >
>> > It might be that there are none. This could be useful in the sense that
>> > if nobody asks for removing the constraint - we won't.
>> >
>> >
>> >
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~dhis2-devs
>> Post to : dhis2-devs@lists.launchpad.net
>> Unsubscribe : https://launchpad.net/~dhis2-devs
>> More help : https://help.launchpad.net/ListHelp
>
>
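
To make the two positions in the quoted discussion concrete, here is a minimal Python sketch. The record layout, the sample numbers and all names (`DataValue`, `derived_totals`, `slice_by_option`, `check_reported_total`) are made up for illustration and are not DHIS2 classes or APIs. It shows a total derived by summing the category options of a data element, as Ola suggests, and a slice over one shared category option with a single filter, as Jason describes.

```python
from collections import defaultdict
from typing import NamedTuple


class DataValue(NamedTuple):
    # Hypothetical flat record; not the DHIS2 data model.
    data_element: str
    category_option: str
    value: int


# Illustrative sample values only.
VALUES = [
    DataValue("Malaria new cases OPD", "Under 1", 4),
    DataValue("Malaria new cases OPD", "1-5", 7),
    DataValue("Malaria new cases OPD", "Over 5", 12),
    DataValue("IPD discharges", "Under 1", 2),
    DataValue("IPD discharges", "Over 5", 5),
]


def derived_totals(values):
    """Total per data element, derived by summing over all its options."""
    totals = defaultdict(int)
    for v in values:
        totals[v.data_element] += v.value
    return dict(totals)


def slice_by_option(values, option):
    """All values for one shared category option, using a single filter."""
    return [v for v in values if v.category_option == option]


def check_reported_total(values, data_element, reported):
    """Compare a total taken from the paper form with the derived total."""
    return derived_totals(values).get(data_element, 0) == reported
```

With this layout no "Total" option is stored at all: the data entry screen could call something like `check_reported_total` to flag a mismatch between the facility's total column and the sum of the age groups, while the persisted data stays free of redundant totals.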


Lars · 5 November 2009 17:19

I have one more question: Would it be acceptable that a DataElementGroup can only be a member of one DataElementGroupSet (1-n)?

The reason for asking is that it would simplify the implementation of report tables and make the groupset and category approach more aligned.
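
As a rough illustration of why the 1-n constraint helps, the Python sketch below builds a reverse lookup from group to group set. The function name `index_groups` and the sample data are hypothetical, and this is not the DHIS2 implementation. With at most one group set per group, the lookup is a plain dictionary, so each group maps to exactly one report table dimension; input that violates the constraint is rejected.

```python
def index_groups(group_sets):
    """Map each group name to its single group set.

    group_sets: dict of group set name -> list of group names.
    Raises ValueError if a group appears in more than one group set.
    """
    owner = {}
    for set_name, groups in group_sets.items():
        for group in groups:
            if group in owner and owner[group] != set_name:
                raise ValueError(
                    f"{group!r} is in both {owner[group]!r} and {set_name!r}"
                )
            owner[group] = set_name
    return owner


# Illustrative sample data, using group names mentioned in this thread.
GROUP_SETS = {
    "Diseases": ["Malaria"],
    "Patient status": ["New cases"],
}

GROUP_TO_SET = index_groups(GROUP_SETS)
```

Without the constraint the lookup would have to return a list of group sets per group, and a report table would need an extra rule for which dimension a group belongs to.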

Lars

Lars · 6 November 2009 10:34

I am starting to think this is necessary. Will implement it. Can change it later if objections emerge.

Lars · 6 November 2009 12:15

The models are now pretty well aligned. I have made a working document here:

https://docs.google.com/Doc?docid=0AfhDQuGFI-A_ZGNjNW1waHhfMThkOWtiamdndA&hl=no
