Delete Event records from psql database

Hi Folks,

Can anyone tell me or point me to documentation that explains the data structure for how tracker events are stored in the psql back end?

I have numerous event tracker based programs and some with a large volume of records (2M+) and during testing I regularly post large volumes of records via the webAPI in a json payload that need to be cleaned out after.

I have found that the trackedentitydatavlaue & programstageinstance tables seem to hold all the relevant data for specific/individual program event - is it safe to just delete the desired records from these tables (based on the correct programstageinstanceid?).

Thanks,
Kevin

2 Likes

Hey Kevin,
deleting and manipulating data directly in the database is not officially supported and documented.

That said, you seem to be on the track. Deleting the programInstance(event) and all related data in trackedentitydatavalye, trackedentitydatavalueaudit and programstageinstancecomments would remove the event from the database.

Needless to say, you should test any scripts to do this thoroughly and always have backups of your data.

Markus

3 Likes

Hi Markus,

Thanks for getting back to me.
I’m in the process of testing such scripts now on private test instance.

Is there a supported/documented process to achieve the same thing (bulk event deletions) via the API?

Best,
Kevin

1 Like

Hi @kgbolger,

You can achieve bulk event deletion using the api, yes. The documentation can be a bit confusing since we are talking about tracker and events both as the same and different things, but I will point you in the right direction:

First you need a payload containing all the events you want to delete. You can have a look here to see example payloads: https://docs.dhis2.org/master/en/developer/html/webapi_tracker_api.html#webapi_events

For the purpose of deleting events, the very minimum information needed in the payload is the following:

{
    events: [
        {
            "event": "<uid of event>"
        }
    ]
} 

The second part is using the correct api endpoint, which you can also find the documentation I linked to:

POST /api/events

Finally, the most important part is utilizing the importStrategy parameter. You can read more about them in the same documentation. The importStrategy you want to use, is “DELETE”.

A final note: Newer versions of DHIS2 soft-delete events, meaning they will be filtered out from the payloads in the api and not included in analytics data, but they will remain in the database. We do not support removing these data from the database though, so if you want to embark on the quest, you will need to deduct dependencies from the database tables yourself, since they are too many to easily sum up.

Hope this helps

1 Like

Fro the ones looking to delete all event, tei of a program

Create the StoreProcedure

CREATE  OR REPLACE  PROCEDURE usp_cleanProgramData( IN i_programId nchar(11))
language plpgsql
as $$
DECLARE  i_pid integer;
begin
SELECT  programid into i_pid from program WHERE uid = i_programId;
delete from trackedentityattributeattributevalues  where trackedentityattributeid in (select p.trackedentityinstanceid from programinstance p join trackedentityattributevalue tia on p.trackedentityinstanceid = tia.trackedentityinstanceid  where p.programid = i_pid);
delete from trackedentityattributevalueaudit where  trackedentityinstanceid  in (select p.trackedentityinstanceid from programinstance p where p.programid = i_pid);
delete from trackedentityattributevalue where  trackedentityinstanceid  in (select p.trackedentityinstanceid from programinstance p where p.programid = i_pid);
delete from programnotificationinstance where programinstanceid in(select p.programinstanceid from programinstance p where p.programid = i_pid);
delete from programstageinstancecomments where programstageinstanceid in (select psi.programstageinstanceid from  programstage ps join programstageinstance psi on ps.programstageid = psi.programstageid where ps.programid = i_pid);
delete from programstageinstance where programstageid in(select ps.programstageid from programstage ps  where ps.programid = i_pid);
delete from trackedentityinstance where trackedentityinstanceid not in (select trackedentityinstanceid from programinstance);
delete from programinstancecomments where programinstanceid in(select programinstanceid from programinstance where programid = i_pid);
delete from programinstance where programid = i_pid;
end; $$;

the run it with the program id

call usp_cleanProgramData('gzr94iNS4Bv');
1 Like