On Wed, Oct 9, 2013 at 11:27 AM, Jason Pickering <jason.p.pickering@gmail.com> wrote:
Hi Devs,
I am working on transforming a DHIS2 database from one metadata base to another. One issue I have not been able to figure out is how to deal with apparently duplicated category option combos. The database we are transforming to has, for some reason, duplicated category option combos. By duplicated I mean they have the same category combinations, but different UIDs. I suspect one of each pair is not real, but I do not have a good idea of how to discriminate between the “real” one and the “bad” one.
On Wed, Oct 9, 2013 at 11:44 AM, Lars Helge Øverland <larshelge@gmail.com> wrote:
Hi Jason,
Are they also duplicates in the sense that they have the same set of category options?
If they are equal in terms of i) category combo and ii) set of category options, and there are no data values, it does not matter which one you keep (just make sure you have no duplicates).
cheers
Lars
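The equality test described above (same category combo, same set of category options) can be sketched in a few lines of Python. This is only an illustration: the `(uid, category_combo, options)` tuple layout, the function name, and the sample UIDs are made up for the example and are not DHIS2 structures.

```python
from collections import defaultdict


def find_duplicate_option_combos(option_combos):
    """Group category option combos sharing a category combo and option set.

    `option_combos` is an iterable of (uid, category_combo, options) tuples.
    This layout is a hypothetical stand-in for rows read from the database.
    """
    groups = defaultdict(list)
    for uid, category_combo, options in option_combos:
        # The order of the options is irrelevant, so compare them as a set.
        key = (category_combo, frozenset(options))
        groups[key].append(uid)
    # Only keys that map to more than one UID are duplicates.
    return {key: uids for key, uids in groups.items() if len(uids) > 1}


# Made-up sample data: uidA and uidB describe the same combination.
sample = [
    ("uidA", "Sex/Age", ["Female", "> 15 years"]),
    ("uidB", "Sex/Age", ["> 15 years", "Female"]),
    ("uidC", "Sex/Age", ["Male", "> 15 years"]),
]
duplicates = find_duplicate_option_combos(sample)
```

Comparing the options as a `frozenset` means two combos count as equal even if their options were stored in a different order.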
On 9 October 2013 11:51, Jason Pickering <jason.p.pickering@gmail.com> wrote:
They are duplicates in the sense that they have the same category options and the same combination of categories. For instance, there would be two (Female, > 15 years) for the same category combination, but with different UIDs.
What I have been able to determine, though, is that the ones which seem to need to be deleted do not show up in the resource table. I managed to get the others out of the categorycombos_optioncombos, categoryoptioncombos_categoryoptions, and categoryoptioncombo tables, and it seems to work. The “duplicates” are visible through the Web API, but their IDs are not present in the resource tables. I suspect these category combos were altered at some point in time, but the category option combos were never updated or deleted as they should have been.
I think it will matter which one we keep, because to be able to sync with the master system we need to use the correct one.
Anyway, maybe that is the procedure: delete anything from those tables which is not present in the resource tables? I am fishing a bit really, but there must be some way to distinguish which of these are actually assigned to a particular data element.
Regards,
Jason
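A rough sketch of the check described above, listing category option combos that are missing from the resource table. The table names `categoryoptioncombo` and `_categorystructure` come from this thread; the column names `categoryoptioncomboid` and `uid` are assumptions, and an in-memory SQLite database with made-up rows stands in for the real DHIS2 database so the example runs on its own.

```python
import sqlite3

# Assumed column names; verify them against the actual schema before use.
ORPHAN_QUERY = """
SELECT coc.categoryoptioncomboid, coc.uid
FROM categoryoptioncombo coc
LEFT JOIN _categorystructure cs
       ON cs.categoryoptioncomboid = coc.categoryoptioncomboid
WHERE cs.categoryoptioncomboid IS NULL
ORDER BY coc.categoryoptioncomboid
"""


def find_orphans(conn):
    """Return (id, uid) rows for option combos absent from the resource table."""
    return conn.execute(ORPHAN_QUERY).fetchall()


# Toy stand-in database with made-up rows; id 2 is missing from the resource table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categoryoptioncombo (categoryoptioncomboid INTEGER PRIMARY KEY, uid TEXT);
CREATE TABLE _categorystructure (categoryoptioncomboid INTEGER);
INSERT INTO categoryoptioncombo VALUES (1, 'uidA'), (2, 'uidB'), (3, 'uidC');
INSERT INTO _categorystructure VALUES (1), (3);
""")
orphans = find_orphans(conn)
```

Running a read-only query like this first gives a list of candidates to review before anything is deleted.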
This sounds like the “correct” route. The Java code which constructs the _categorystructure table is probably the same code that is used when assigning to particular data elements, so what the resource table thinks is the correct set should be the correct set. Mind you, Lars is also probably right that if you delete the “true” one and regenerate the resource table, it might well be just as happy to use the duplicate instead. In that case it might not be an issue unless (i) you have existing data using these or (ii) the UIDs are being used anywhere, e.g. in formulae.
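The two caveats above (existing data, and UIDs referenced in formulae) could be checked before deleting a candidate. The helper below is a hypothetical sketch: the function name, the inputs, and the sample UIDs are invented for illustration, and the `#{dataElement.categoryOptionCombo}` expression shown is only an example of the kind of formula text one would search.

```python
def uid_in_use(uid, data_value_uids, expressions):
    """Return the reasons why deleting `uid` could be unsafe.

    `data_value_uids` is the set of option combo UIDs that have data values;
    `expressions` is a list of formula strings. Both are assumed to have been
    exported from the database beforehand.
    """
    reasons = []
    if uid in data_value_uids:
        reasons.append("data values")
    # A plain substring search is enough for a first pass over the formulae.
    if any(uid in expression for expression in expressions):
        reasons.append("formulae")
    return reasons


# Made-up sample inputs.
used_in_data = {"cocUid00001"}
formulae = ["#{deUid000001.cocUid00002} + #{deUid000002.cocUid00001}"]

safe_candidate = uid_in_use("cocUid00003", used_in_data, formulae)
risky_candidate = uid_in_use("cocUid00001", used_in_data, formulae)
```

An empty list means neither caveat applies to that UID in the data that was checked.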