Hello,
We have a very large DHIS2 aggregate instance with many datasets and as a result many COCs. There was a mistake made with one of our heavily used categories that has caused the system to generate ~100,000 “obsolete” COCs, and this is causing some other problems. I’m hoping someone can advise on the best way to remove these obsolete COCs from the system…
Details: Most category combinations in our system contain the Sex category. Recently, someone added a new category option to the sex Category, resulting in the creation of 100,000 COCs. In reality, this new category option was only needed for a subset of data elements, so we decided a more efficient approach would be to create a new category that can contain this new category option, and remove that option from the Sex category. As a result, the 100,000 COCs that were created are now “obsolete” (they don’t belong to any CC that still exists in the system).
Ever since doing this, we have been unable to run “Update category option combinations”. Looking at the logs, its a timeout issue. When we run the update it is trying to delete the 100,000 COCs, but its unable to do so before the system times out. However, based on the logs, we’ve calculated and it would take about 14 days to complete successfully. We cannot logically increase our timeout to 14 days.
We’ve tried various approaches to remove these obsolete COCs, including a script that deletes them. However, to delete all COCs the script would take about 3 months… I was wondering if anyone has any experience with something of this magnitude and any advice on how to approach it.
Hi @kstankevitz
I think in this case, the most efficient way is with an SQL query. I believe that this SQL statement should help to identify all instances of these so called “disjoint category option combinations”.
You will need to modify that statement a bit to actually delete the COCs, but I would start by first confirming that you can actually identify them.
Hope this helps and let us know if you require further assistance.
Hello @jason, thanks for taking time to respond to @kstankevitz. Also for further researchers. I used below to get the aocs decommissioned with no data-value. This now allow for decision making to isolate for delete.
select dv.*, ccb."name" from datavalue dv
inner join categoryoptioncombo coc on coc.categoryoptioncomboid = dv.attributeoptioncomboid
inner join categoryoptioncombos_categoryoptions c on c.categoryoptioncomboid = coc.categoryoptioncomboid
inner join dataelementcategoryoption d on d.categoryoptionid = c.categoryoptionid
inner join categories_categoryoptions e on e.categoryoptionid = d.categoryoptionid
inner join dataelementcategory dt on dt.categoryid = e.categoryid
inner join categorycombos_categories b on b.categoryid = dt.categoryid
inner join categorycombo ccb on ccb.categorycomboid=b.categorycomboid
inner join dataset ds on ds.categorycomboid = ccb.categorycomboid
where dt.datadimensiontype = 'ATTRIBUTE' and ds.datasetid is null