Can anyone tell me or point me to documentation that explains the data structure for how tracker events are stored in the psql back end?
I have numerous event tracker based programs and some with a large volume of records (2M+) and during testing I regularly post large volumes of records via the webAPI in a json payload that need to be cleaned out after.
I have found that the trackedentitydatavlaue & programstageinstance tables seem to hold all the relevant data for specific/individual program event - is it safe to just delete the desired records from these tables (based on the correct programstageinstanceid?).
Hey Kevin,
deleting and manipulating data directly in the database is not officially supported and documented.
That said, you seem to be on the track. Deleting the programInstance(event) and all related data in trackedentitydatavalye, trackedentitydatavalueaudit and programstageinstancecomments would remove the event from the database.
Needless to say, you should test any scripts to do this thoroughly and always have backups of your data.
You can achieve bulk event deletion using the api, yes. The documentation can be a bit confusing since we are talking about tracker and events both as the same and different things, but I will point you in the right direction:
For the purpose of deleting events, the very minimum information needed in the payload is the following:
{
events: [
{
"event": "<uid of event>"
}
]
}
The second part is using the correct api endpoint, which you can also find the documentation I linked to:
POST /api/events
Finally, the most important part is utilizing the importStrategy parameter. You can read more about them in the same documentation. The importStrategy you want to use, is “DELETE”.
A final note: Newer versions of DHIS2 soft-delete events, meaning they will be filtered out from the payloads in the api and not included in analytics data, but they will remain in the database. We do not support removing these data from the database though, so if you want to embark on the quest, you will need to deduct dependencies from the database tables yourself, since they are too many to easily sum up.