How disaggregation impacts data

dmbantu · 28 October 2021 10:43

Hi everyone,

From reading the DHIS 2 documentation, I understand that we can disaggregate data in two ways: category options – categories – category combinations, or category options – category option groups – category option group sets.

Could anyone provide a detailed explanation of these two approaches and the impact they have on data analysis? I would be happy to understand the difference.

Thanks

jason · 2 November 2021 09:31

Hi @dmbantu ,
Category options, categories and category combinations are used during data collection to disaggregate data. They can also be used in pivot tables and the other analytical apps to aggregate data to higher levels. For instance, you might have two categories, “Sex” and “Age”, as part of a category combination. The “Sex” category might have category options like “Male” and “Female”, while the “Age” category might have options like <1, 1-5, 10-14, 15-25, 25+. Using a pivot table, you could aggregate the data to display it only by “Male” and “Female”. In this case, the values for all of the “Age” category options would be summed together within each of the “Male” and “Female” category options.
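To make the aggregation concrete, here is a minimal sketch in plain Python. It is not DHIS 2 code or its API; the data values and the function name `aggregate_by_sex` are made up for illustration. It only shows the arithmetic a pivot table performs when you keep “Sex” and drop “Age”.

```python
from collections import defaultdict

# Hypothetical data values, keyed by (sex, age) category option combination.
# Only two of the age options are shown to keep the example short.
values = {
    ("Male", "<1"): 4,
    ("Male", "1-5"): 7,
    ("Female", "<1"): 5,
    ("Female", "1-5"): 6,
}


def aggregate_by_sex(values):
    """Sum over all age options, keeping only the sex dimension."""
    totals = defaultdict(int)
    for (sex, _age), value in values.items():
        totals[sex] += value
    return dict(totals)


totals_by_sex = aggregate_by_sex(values)
```

With these made-up numbers, `totals_by_sex` holds one total per sex, with the age detail summed away.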

Category option groups and group sets can be used in the analytical apps to reaggregate data in different ways. You might create two groups called “<15” and “15+”. You could then assign all age category options below 15 to the first group, and those at 15 and above to the second. This would allow you (continuing the example from the previous paragraph) to reaggregate the fine age bands into coarser age bands. Category option groups and group sets can also be used to compare data which have related, but perhaps not exactly the same, category options.

So, in summary, categories, category options and category combinations are used to collect data, while category option groups and group sets can be used to reaggregate the data in different ways than it was collected.

Hope this helps!

dmbantu · 2 November 2021 10:23

Hi @jason,

Your explanation is very clear. I now see the difference, but I didn’t understand this part:

I mean the terms “fine age bands” and “coarse age bands”.
Thanks

jason · 2 November 2021 12:13

Hi again @dmbantu .

What I mean is that the “fine age bands” might correspond to the age groups <1, 1-5, 10-14, 15-25, 25+.

The “coarse age bands” might correspond to just <15 and 15+.

Using the category option group set approach, you can reaggregate the data in the analytical applications based on new groups. So, in the example we have been using, the category options could be grouped as follows:

<15 : <1, 1-5, 10-14
15+ : 15-25, 25+

When using a pivot table or chart, you could choose to reaggregate the data according to these groups instead of how the data was collected.
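As a rough illustration of that grouping (plain Python, not DHIS 2 code; the data values and the names `AGE_GROUPS` and `reaggregate` are hypothetical), the reaggregation amounts to mapping each collected category option to its group and summing:

```python
# Category option groups from the example: group name -> member category options.
AGE_GROUPS = {
    "<15": ["<1", "1-5", "10-14"],
    "15+": ["15-25", "25+"],
}


def reaggregate(values_by_option, groups):
    """Sum values collected per category option into their groups."""
    # Invert the mapping so each option points at the group it belongs to.
    option_to_group = {
        option: group for group, options in groups.items() for option in options
    }
    totals = {group: 0 for group in groups}
    for option, value in values_by_option.items():
        totals[option_to_group[option]] += value
    return totals


# Hypothetical values collected against the fine age bands.
collected = {"<1": 3, "1-5": 8, "10-14": 5, "15-25": 12, "25+": 20}
coarse = reaggregate(collected, AGE_GROUPS)
```

The data is still stored against the fine age bands; only the view in the analysis changes.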

One of the most common applications of category option groups is when you wish to combine data which has been collected according to differing category options. These category options might differ between data elements, or perhaps they have changed over time. Category option groups and group sets offer you a way to combine data at higher levels of aggregation, where the data can become comparable.
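Here is a small sketch of that idea, again in plain Python rather than anything DHIS 2 specific. The second set of age options (0-4, 5-14, 15-49, 50+), the data values, and all of the names are assumptions made up for the example. Two data elements collected with different age options become comparable once both are mapped to the same groups:

```python
def to_coarse(values_by_option, option_to_group):
    """Sum values per category option into the group each option maps to."""
    totals = {}
    for option, value in values_by_option.items():
        group = option_to_group[option]
        totals[group] = totals.get(group, 0) + value
    return totals


# Hypothetical mapping: options from two different age categories
# assigned to the same two category option groups.
OPTION_TO_GROUP = {
    "<1": "<15", "1-5": "<15", "10-14": "<15", "15-25": "15+", "25+": "15+",
    "0-4": "<15", "5-14": "<15", "15-49": "15+", "50+": "15+",
}

# Hypothetical values for two data elements collected with differing options.
element_a = {"<1": 2, "1-5": 6, "10-14": 4, "15-25": 9, "25+": 11}
element_b = {"0-4": 10, "5-14": 5, "15-49": 14, "50+": 3}

coarse_a = to_coarse(element_a, OPTION_TO_GROUP)
coarse_b = to_coarse(element_b, OPTION_TO_GROUP)
```

After the mapping, both results share the keys “<15” and “15+”, so they can be shown side by side even though the original age bands do not line up.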

Hope that helps to clarify a bit better!

dmbantu · 2 November 2021 13:15

Hi @jason,

This is a really detailed explanation. It has dispelled any doubts I had about the difference between the two approaches.

Thank you very much.