fixing what's broke without breaking what's fixed - categoryoptioncombos

I have a problem on which I'd welcome some input. It is at least
peripherally related to the mail Knut has just sent about case-sensitive
name matching (MALE != Male).

It's about fixing the categoryoptioncombos and bringing in the notion
of concepts, which has already been discussed at length elsewhere and
is blueprinted and tagged here:
https://blueprints.launchpad.net/dhis2/+spec/extend-category-and-groupsets-model-with-concepts

While implementing this is fairly straightforward, I'm a bit unsure
how to manage updating existing databases, which have already
built up an impressive array of aliases for things like '<5'.

One aim is to get rid of, or rather merge, categoryoptions like "<5",
" <5", "HIV_under5", "TB_under5" etc. We know these aliases exist
because categoryoptions must currently be unique and cannot appear in
more than one category, which forces implementers to invent various
ways of saying "under 5".
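
As a rough illustration of how the purely cosmetic aliases could be found, here is a minimal sketch that groups option names differing only in whitespace or case. The function names and the sample list are made up for the example and are not part of DHIS. Names like "HIV_under5" are deliberately not matched, since deciding that they mean "<5" needs a human.

```python
from collections import defaultdict


def normalise(name: str) -> str:
    """Drop all whitespace and fold case, so ' <5' and '<5' compare equal."""
    return "".join(name.split()).casefold()


def alias_candidates(option_names):
    """Group option names whose normalised forms collide."""
    groups = defaultdict(list)
    for name in option_names:
        groups[normalise(name)].append(name)
    # Only groups with more than one spelling are merge candidates.
    return {key: names for key, names in groups.items() if len(names) > 1}


# Hypothetical sample of existing categoryoption names.
options = ["<5", " <5", "MALE", "Male", "HIV_under5", "TB_under5"]
candidates = alias_candidates(options)
```

This only proposes candidates; the semantic aliases still have to be mapped by someone who knows the data.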

The proposal to fix this is to have a single categoryoption '<5' with
a concept AGE, and various categories with concept AGE (i.e. various
lists of age groups, all drawing from the same pool of options with
concept AGE). This should work well and cause no problems with new
databases, or wherever there are no existing datavalues. But the
problem I see is that it is going to be quite difficult to make these
adjustments and mergers on a live database with existing datavalues
mapped to existing categorycombos, i.e. repairing the damage of the
past 12 months.
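
To make the proposal concrete, here is a minimal sketch of the idea, assuming a concept is just a label shared by an option and by every category allowed to use it. The class names, the "5+" and "Male" options and the GENDER concept are illustrative assumptions, not the blueprint's actual design.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CategoryOption:
    name: str
    concept: str


@dataclass
class Category:
    name: str
    concept: str
    options: list = field(default_factory=list)

    def add_option(self, option: CategoryOption) -> None:
        # A category may only draw from the pool of options with its own concept.
        if option.concept != self.concept:
            raise ValueError(
                f"{option.name!r} has concept {option.concept}, "
                f"but category {self.name!r} expects {self.concept}"
            )
        self.options.append(option)


# One shared '<5' option instead of HIV_under5, Malaria_under5, ...
under5 = CategoryOption("<5", "AGE")
five_plus = CategoryOption("5+", "AGE")      # hypothetical second age group
male = CategoryOption("Male", "GENDER")      # hypothetical non-AGE option

hiv_age = Category("HIV_age", "AGE")
malaria_age = Category("Malaria_age", "AGE")
for category in (hiv_age, malaria_age):
    category.add_option(under5)
hiv_age.add_option(five_plus)
```

With this shape, both categories hold the very same '<5' object, and trying to add "Male" to an AGE category is rejected.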

Imagine I have 2 dataelements: de1 with category HIV_age, which
includes the option HIV_under5, and de2 with category Malaria_age,
which includes the categoryoption 'Malaria_under5'. Basically I want
to be able to merge "HIV_under5" with "Malaria_under5" to create just
"under5". Then we are back to some form of sanity. And worse, I need
users to be able to do this - it can't be done automatically. The
problem is that doing this will require quite a significant amount of
background action: deleting existing categoryoptions, building new
categoryoptioncombo ids, and mapping the old categoryoptioncombo ids
to the new ones in the affected datavalues. Not a job for the
fainthearted. Do we have any experience with this kind of fiddling
with existing categoryoptions (other than manually)?
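
The background action described above can be sketched in memory as follows. This is only an illustration under simplifying assumptions: a categoryoptioncombo is modelled as an id mapped to a set of option names, a datavalue as a plain dict, and the lowest old id survives when several combos collapse into one. None of these structures or names are the real DHIS model, and the sample ids, period and values are invented.

```python
def merge_options(combos, datavalues, merge_map):
    """Merge options and remap combos and datavalues.

    combos:     dict of combo id -> frozenset of option names
    datavalues: list of dicts with keys dataelement, period, orgunit,
                combo_id, value
    merge_map:  dict of old option name -> surviving option name (user supplied)
    """
    # 1. Rewrite the options inside every combo.
    rewritten = {
        cid: frozenset(merge_map.get(opt, opt) for opt in opts)
        for cid, opts in combos.items()
    }
    # 2. Combos that now hold the same options collapse to the lowest old id.
    survivor = {}
    for cid in sorted(rewritten):
        survivor.setdefault(rewritten[cid], cid)
    id_map = {cid: survivor[opts] for cid, opts in rewritten.items()}
    new_combos = {cid: opts for opts, cid in survivor.items()}
    # 3. Point datavalues at the surviving ids and flag any collisions.
    seen, new_values, conflicts = {}, [], []
    for dv in datavalues:
        moved = dict(dv, combo_id=id_map[dv["combo_id"]])
        key = (moved["dataelement"], moved["period"],
               moved["orgunit"], moved["combo_id"])
        if key in seen:
            conflicts.append((seen[key], moved))  # needs a human decision
        else:
            seen[key] = moved
            new_values.append(moved)
    return new_combos, new_values, id_map, conflicts


# Invented sample data for de1 (HIV_age) and de2 (Malaria_age).
combos = {
    1: frozenset({"HIV_under5", "Male"}),
    2: frozenset({"Malaria_under5", "Male"}),
    3: frozenset({"HIV_under5", "Female"}),
}
datavalues = [
    {"dataelement": "de1", "period": "201108", "orgunit": "A", "combo_id": 1, "value": 7},
    {"dataelement": "de2", "period": "201108", "orgunit": "A", "combo_id": 2, "value": 4},
]
merge_map = {"HIV_under5": "under5", "Malaria_under5": "under5"}

new_combos, new_values, id_map, conflicts = merge_options(combos, datavalues, merge_map)
```

In this sample de1 and de2 end up sharing one combo id and nothing collides, because they are different dataelements. A collision would arise if two merged options had both been used for the same dataelement, period and orgunit, which is one reason the step cannot be fully automatic.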

It seems that what we require here is some sort of categoryoptioncombo
merge functionality in the category service? Or is this functionality
we need in DHIS in the long term at all? Is it better solved by an
external fixer-upper tool?

Bob

I keep saying this again and again: the biggest mistake was restricting options to appear in only one category.

For me, options are simply units that users can apply to measure or count data along a specific dimension/category: age, gender, stock, hospital ward, or any other dimension that makes sense to them.

I don't see the need for multiple <5 units, whether we are talking about malaria or HIV. Whatever unit we are talking about, it all comes from primary registers, where the actual source of data is name-based individuals. And I don't understand why we would pick different <5 units when referring to the same individual who, for example (unfortunately, of course), is in both the TB and HIV registers.

···

On Tue, Sep 20, 2011 at 3:20 PM, Bob Jolliffe bobjolliffe@gmail.com wrote:

Mailing list: https://launchpad.net/~dhis2-devs
Post to : dhis2-devs@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-devs
More help : https://help.launchpad.net/ListHelp

I still keep saying this again and again. the biggest mistake is
when options are restricted to appear only in one category.

Abyot, I am not disagreeing with you. I'm not sure what the biggest
mistake is, but that certainly was a mistake. The question is how
best to fix it.

···

On 20 September 2011 14:50, Abyot Gizaw <abyota@gmail.com> wrote:


Bob, I called it "biggest" in light of the name-based module, which for
me is the source of aggregate figures. The solution, for me, is to put
the n-n (many-to-many) relation back.
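
For readers unfamiliar with the term, an n-n relation between categories and options is typically stored as a join table. The sketch below uses an in-memory SQLite database; the table and column names are assumptions made for the example and are not the DHIS schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    id   INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE categoryoption (
    id   INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
-- Join table: one option may belong to many categories and vice versa.
CREATE TABLE category_member (
    category_id INTEGER NOT NULL REFERENCES category(id),
    option_id   INTEGER NOT NULL REFERENCES categoryoption(id),
    PRIMARY KEY (category_id, option_id)
);
""")

conn.executemany(
    "INSERT INTO category (id, name) VALUES (?, ?)",
    [(1, "HIV_age"), (2, "Malaria_age")],
)
conn.execute("INSERT INTO categoryoption (id, name) VALUES (1, '<5')")
# The single '<5' option is a member of both categories.
conn.executemany("INSERT INTO category_member VALUES (?, ?)", [(1, 1), (2, 1)])

rows = conn.execute("""
    SELECT c.name
    FROM category c
    JOIN category_member m ON m.category_id = c.id
    JOIN categoryoption o ON o.id = m.option_id
    WHERE o.name = '<5'
    ORDER BY c.name
""").fetchall()
```

The option name stays unique, but nothing stops it from being listed under several categories.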

···

On 9/20/11, Bob Jolliffe <bobjolliffe@gmail.com> wrote:
