Restrictions on category combinations

Hi!

Is there an implicit restriction on overlapping categoryOptions between the categories in a categoryCombination? I tried creating the category options ‘True’ and ‘False’, and two categories, ‘Bool1’ and ‘Bool2’, both containing the category options ‘True’ and ‘False’. When combining Bool1 and Bool2 in a categoryCombination, I expected to get four categoryOptionCombinations:

  • True, True
  • True, False
  • False, True
  • False, False

Instead I got:

  • True, True
  • True, False, True, False
  • False, False
    (Note: the combination named “True, False, True, False” referred to each category option only once; only its name had four entries.)
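
The gap between the expected and the observed result can be reproduced with a small sketch. It assumes that a category option combination is identified only by the set of category options it contains. That assumption matches the three combinations reported above, but it is not confirmed DHIS2 behaviour.

```python
from itertools import product

# Both categories reuse the same two category options.
bool1 = ["True", "False"]
bool2 = ["True", "False"]

# Expected: one combination per ordered pair (Bool1 option, Bool2 option).
expected = list(product(bool1, bool2))

# Assumed model (not confirmed): a combination is identified only by its
# set of options, so ("True", "False") and ("False", "True") coincide.
observed = {frozenset(pair) for pair in expected}

print(len(expected))  # 4
print(len(observed))  # 3
```

Under that assumption, the two mixed pairs collapse into a single combination, leaving three in total.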

Is this the intended behavior?

Br. August Matisen

Hi @augustsm

It appears that this is a known issue with a recommended fix in the documentation:

Duplicated category option combinations within a category combination.

Within each category combination, a unique set of category option combinations should exist. In certain circumstances, duplicate category option combinations may exist in the system. This usually results from changes to category combinations after they have been created, or direct manipulation of the various category tables in the database. This may result in certain data element/category option combinations not appearing or being unavailable in the data entry screens and/or analytics apps.

Severity: Severe

Recommendation: Duplicated category option combinations within a category combination will require you to merge category option combinations together. This will require direct manipulation of the database, and should always be conducted first in a testing environment. Only after you have thoroughly tested your procedure, and have confidence that it works, should you perform the procedure on your production environment. The DHIS2 implementation team has created a series of SQL functions to help you remove these duplicated COCs from your system.
(source)

Hi @Gassim

This does not seem to be the same bug. What I reported is not a case of duplicate category option combinations within a category combination, but rather overlapping category options between categories in a category combination. The resulting three category option combinations were perfectly unique; they just did not match my expectations, and I’m wondering whether that is intentional.

Hi @augustsm
If I understand you correctly, you created two categories, each with category options “True” and “False”?

As noted in the docs, this type of design should not be used. Namely, you should not have any category options that are shared between categories within the same category combination. And as noted in the docs, “This may result in certain data element/category option combinations not appearing or being unavailable in the data entry screens and/or analytics apps.”

I am not sure what exactly you are trying to do, but normally this case happens when implementers use “Unknown” in more than one of their categories. So you might have one category like “Gender” (Male, Female, Unknown) and another like “Age” (<15, 15+, Unknown). You could then create a category combination of Age + Gender. Now, what you should NOT do is use the same “Unknown” in two categories which are part of the same category combination. You should instead use “Unknown gender” and “Unknown age” (two different category options).
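
A minimal sketch of how such a design could be checked before it is created. The function name `shared_options` and the plain dictionary layout are hypothetical and are not part of DHIS2.

```python
from collections import defaultdict

def shared_options(categories):
    """Return {option: [category names]} for options used by more than one category."""
    seen = defaultdict(list)
    for category, options in categories.items():
        for option in set(options):
            seen[option].append(category)
    return {opt: cats for opt, cats in seen.items() if len(cats) > 1}

# The design to avoid: one "Unknown" option reused in both categories.
bad = {"Gender": ["Male", "Female", "Unknown"], "Age": ["<15", "15+", "Unknown"]}
# The recommended design: a separate option per category.
good = {"Gender": ["Male", "Female", "Unknown gender"], "Age": ["<15", "15+", "Unknown age"]}

print(shared_options(bad))   # {'Unknown': ['Gender', 'Age']}
print(shared_options(good))  # {}
```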

Hope this helps to clarify.

Best regards,
Jason

Thanks @jason, this was what I wanted to know. Can you link to the place in the docs where it is mentioned?

On the question of what I’m trying to do, it relates to auto-import of data via the API. To do this reliably, I need to know how category option combinations work under the hood, so that I set the correct category option combination ID when calling the API. I have not found a comprehensive guide to this in the docs, so I have resorted to experimentation, which is why I created the categories in question. Does such a guide exist?
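
A minimal sketch of the kind of lookup this involves, assuming the category option combinations have already been fetched into a list of dictionaries with `id` and `categoryOptions` keys. That shape, the function name `build_coc_lookup`, and the IDs are assumptions made for illustration.

```python
def build_coc_lookup(category_option_combos):
    """Map each frozenset of category option IDs to its category option combo ID."""
    lookup = {}
    for coc in category_option_combos:
        key = frozenset(opt["id"] for opt in coc["categoryOptions"])
        if key in lookup:
            # Two combos with the same option set cannot be told apart this way.
            raise ValueError(f"ambiguous option set for {coc['id']} and {lookup[key]}")
        lookup[key] = coc["id"]
    return lookup

# Made-up IDs, for illustration only.
sample = [
    {"id": "cocA", "categoryOptions": [{"id": "optMale"}, {"id": "optUnder15"}]},
    {"id": "cocB", "categoryOptions": [{"id": "optFemale"}, {"id": "optUnder15"}]},
]
lookup = build_coc_lookup(sample)
print(lookup[frozenset({"optUnder15", "optMale"})])  # cocA
```

Keying on a set of option IDs only works when no option is shared between the categories of a combination, which is the restriction discussed above.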

Best regards, August

Edit: I should note that the end goal here is for our users to be able to create data sets outside DHIS2 and then import them into the solution for visualisation, sharing via API etc. Therefore I need to know what the limitations on that input data will be. Based on this conversation, it will not be possible for them to have multiple input columns with overlapping values like “True”, “False” or “Unknown”.
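
One possible way to handle this on the import side, following the renaming advice earlier in the thread, is to make overlapping values unique per column before creating category options. This is only a sketch; the function name `disambiguate` and the row layout are hypothetical.

```python
from collections import defaultdict

def disambiguate(rows, columns):
    """Suffix values that occur in more than one column with their column name."""
    used_in = defaultdict(set)
    for row in rows:
        for col in columns:
            used_in[row[col]].add(col)
    return [
        {col: f"{row[col]} ({col})" if len(used_in[row[col]]) > 1 else row[col]
         for col in columns}
        for row in rows
    ]

rows = [
    {"gender": "Male", "age": "Unknown"},
    {"gender": "Unknown", "age": "15+"},
]
print(disambiguate(rows, ["gender", "age"]))
# [{'gender': 'Male', 'age': 'Unknown (age)'}, {'gender': 'Unknown (gender)', 'age': '15+'}]
```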