Can anyone tell me or point me to documentation that explains the data structure for how tracker events are stored in the psql back end?
I have numerous event tracker based programs and some with a large volume of records (2M+) and during testing I regularly post large volumes of records via the webAPI in a json payload that need to be cleaned out after.
I have found that the trackedentitydatavlaue & programstageinstance tables seem to hold all the relevant data for specific/individual program event - is it safe to just delete the desired records from these tables (based on the correct programstageinstanceid?).
Hey Kevin,
deleting and manipulating data directly in the database is not officially supported and documented.
That said, you seem to be on the track. Deleting the programInstance(event) and all related data in trackedentitydatavalye, trackedentitydatavalueaudit and programstageinstancecomments would remove the event from the database.
Needless to say, you should test any scripts to do this thoroughly and always have backups of your data.
You can achieve bulk event deletion using the api, yes. The documentation can be a bit confusing since we are talking about tracker and events both as the same and different things, but I will point you in the right direction:
First you need a payload containing all the events you want to delete. You can have a look here to see example payloads: Tracker - DHIS2 Documentation
For the purpose of deleting events, the very minimum information needed in the payload is the following:
{
events: [
{
"event": "<uid of event>"
}
]
}
The second part is using the correct api endpoint, which you can also find the documentation I linked to:
POST /api/events
Finally, the most important part is utilizing the importStrategy parameter. You can read more about them in the same documentation. The importStrategy you want to use, is “DELETE”.
A final note: Newer versions of DHIS2 soft-delete events, meaning they will be filtered out from the payloads in the api and not included in analytics data, but they will remain in the database. We do not support removing these data from the database though, so if you want to embark on the quest, you will need to deduct dependencies from the database tables yourself, since they are too many to easily sum up.
Fro the ones looking to delete all event, tei of a program
Create the StoreProcedure
CREATE OR REPLACE PROCEDURE usp_cleanProgramData( IN i_programId nchar(11))
language plpgsql
as $$
DECLARE i_pid integer;
begin
SELECT programid into i_pid from program WHERE uid = i_programId;
delete from trackedentityattributeattributevalues where trackedentityattributeid in (select p.trackedentityinstanceid from programinstance p join trackedentityattributevalue tia on p.trackedentityinstanceid = tia.trackedentityinstanceid where p.programid = i_pid);
delete from trackedentityattributevalueaudit where trackedentityinstanceid in (select p.trackedentityinstanceid from programinstance p where p.programid = i_pid);
delete from trackedentityattributevalue where trackedentityinstanceid in (select p.trackedentityinstanceid from programinstance p where p.programid = i_pid);
delete from programnotificationinstance where programinstanceid in(select p.programinstanceid from programinstance p where p.programid = i_pid);
delete from programstageinstancecomments where programstageinstanceid in (select psi.programstageinstanceid from programstage ps join programstageinstance psi on ps.programstageid = psi.programstageid where ps.programid = i_pid);
delete from programstageinstance where programstageid in(select ps.programstageid from programstage ps where ps.programid = i_pid);
delete from trackedentityinstance where trackedentityinstanceid not in (select trackedentityinstanceid from programinstance);
delete from programinstancecomments where programinstanceid in(select programinstanceid from programinstance where programid = i_pid);
delete from programinstance where programid = i_pid;
end; $$;