Seems like a good idea to me - this is certainly a serious problem. Hope it can be part of 2.21.
···
On Thu, Oct 15, 2015 at 5:33 AM, Wilson, Randy rwilson@msh.org wrote:
Hi all,
It seems to me from the pgadmin create table script below that the table trackedentitydatavalue table does not have a unique primary key constraint (programstageinstanceid+dataelementid). I theory adding that constraint would make it impossible to have duplicates.
here is the extra constraint needed:
CONSTRAINT pk_programstageinstanceid_dataelementid PRIMARY KEY (programstageinstanceid, dataelementid)
I hesitate to do it on our production instance because I wonder if the Java code should be checked to make sure the system will recover gracefully if a user tries to save the same record twice within a few seconds due to the latency issues discussed below.
Is that a change that can be tried by the developers and then moved into trunk?
current structure of table:
CREATE TABLE trackedentitydatavalue
(
programstageinstanceid integer NOT NULL,
dataelementid integer NOT NULL,
value character varying(255),
“timestamp” timestamp without time zone,
providedelsewhere boolean,
storedby character varying(31),
CONSTRAINT fk_entityinstancedatavalue_dataelementid FOREIGN KEY (dataelementid)
REFERENCES dataelement (dataelementid) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT fk_entityinstancedatavalue_programstageinstanceid FOREIGN KEY (programstageinstanceid)
REFERENCES programstageinstance (programstageinstanceid) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
OIDS=FALSE
);
ALTER TABLE trackedentitydatavalue
OWNER TO individualrecords;
Thanks,
Randy
This message and its attachments are confidential and solely for the intended recipients. If received in error, please delete them and notify the sender via reply e-mail immediately.
–
On Fri, Oct 9, 2015 at 6:35 PM, Knut Staring knutst@gmail.com wrote:
Seems like what we also experienced a lot for tLiberia, I think.
Knut
–
Randy Wilson
*Team Leader: *Knowledge Management, Data Use and Research
Rwanda Health System Strengthening Activity
Management Sciences for Health
Rwanda-Kigali
Direct: +250 788308835
E-mail: rwilson@msh.org
Skype: wilsonrandy_us

Stronger health systems. Greater health impact.

www.msh.org
On Fri, Oct 9, 2015 at 11:13 AM, Bob Jolliffe bobjolliffe@gmail.com wrote:
If you experience reasonably high latency (slow response times) it seems quite likely you might well get duplicates. Probably related to
https://bugs.launchpad.net/dhis2/+bug/1498505
Mailing list: https://launchpad.net/~dhis2-users
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-users
More help : https://help.launchpad.net/ListHelp
Knut Staring
Dept. of Informatics, University of Oslo
Norway: +4791880522
Skype: knutstar
http://dhis2.org
–
On 9 October 2015 at 15:40, Wilson, Randy rwilson@msh.org wrote:
I will check but when i remove the duplicates analytics runs. Also usually we get many more duplicates 100 or more each time I check.
Randy
On Oct 9, 2015 3:48 PM, “Ant Snyman” ant@hisp.org wrote:
Hi Randy,
There is actually a constraint on the trackedentitydatavalue table and you cannot create two entries with the same programstageinstance and dataelementid. I notice that you are grouping on dataelement name, is it possible that you could have two dataelements with the same name in your dataelement table? I.e different dataelementids but the same name?
Regards
Ant
This message and any attachments are subject to a disclaimer published at http://www.hisp.org/policies.html#comms_disclaimer. Please read the disclaimer before opening any attachment or taking any other action in terms of this electronic transmission. If you cannot access the disclaimer, kindly send an email to disclaimer@hisp.org and a copy will be provided to you. By replying to this e-mail or opening any attachment you agree to be bound by the provisions of the disclaimer.
This message and its attachments are confidential and solely for the intended recipients. If received in error, please delete them and notify the sender via reply e-mail immediately.
Mailing list: https://launchpad.net/~dhis2-users
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-users
More help : https://help.launchpad.net/ListHelp
On 9 October 2015 at 10:22, Wilson, Randy rwilson@msh.org wrote:
Hi all,
We have an big issue with several of Rwanda’s event reporting systems that have been set up using Tracker/Individual records. For some reason, users are able to create duplicate records in the system through normal data entry. It seems to happen when they enter a value for an attribute and then a couple of seconds or milliseconds later they change that value. The system then stores both records with last updated several milliseconds apart.
This is a big issue because analytics fails to run when it finds these duplicates.
Here is the query that finds the duplicates:
select programstageinstanceid, dataelementname, count(*) as numrecs from
(
SELECT
program.name AS programname,
program.description,
programinstance.dateofincident,
programinstance.trackedentityinstanceid,
programinstance.status,
programstageinstance.programstageinstanceid,
programstageinstance.programinstanceid,
programstageinstance.programstageid,
trackedentitydatavalue.value,
dataelement.name AS dataelementname
FROM
public.programstageinstance,
public.programinstance,
public.program,
public.trackedentitydatavalue,
public.dataelement
WHERE
programstageinstance.programinstanceid = programinstance.programinstanceid AND
program.programid = programinstance.programid AND
trackedentitydatavalue.programstageinstanceid = programstageinstance.programstageinstanceid AND
dataelement.dataelementid = trackedentitydatavalue.dataelementid
) as foo
group by programstageinstanceid, dataelementname
having count(*) >1
order by numrecs desc;
Here is the typical output of the query when there are duplicate records:
Programstageinstanceid, dataelementname,numrecs
777921;“TB Sample type_morning Sputum”;2
776886;“TB Date of collection_sample”;2
773132;“TB History of contact with TPB+ case”;2
Here is an example of the duplicates we see when we select the duplicated records from the trackedentitydatavalue table.
prograstageinstanceid, dataelementid,value,lastupdated,providedelsewhere,user
773132;35105;“true”;“2015-10-02 09:32:48.334”;FALSE;“abibarimana”
773132;35105;“true”;“2015-10-02 09:32:48.508”;FALSE;“abibarimana”
776886;31879;“2015-07-17”;“2015-10-07 16:25:26.341”;FALSE;“mugeni”
776886;31879;“2015-07-14”;“2015-10-07 16:25:26.546”;FALSE;“mugeni”
777921;35106;“true”;“2015-10-08 09:55:36.133”;FALSE;“emufasha”
777921;35106;“true”;“2015-10-08 09:55:36.278”;FALSE;“emufasha”
In order to clean up the trackedentitydatavalue table, I need to write these records to a temporary table, delete them from trackedentitydatavalue, then decide which of the duplicate pairs to delete before copying the correct ones back into trackedentitydatavalues.
This error seems to happen almost daily when the systems are in full use.
Is there a way we can put a constraint on the trackedentitydatavalue table so that it will not accept duplicate values for programstageinstanceid and dataelementid? I also think there must be an error in the java code that should be updating the first saved record with the one entered by the user later rather than inserting a new record.
This is a major bug because it causes analytics to fail and when it fails we cannot run any event reports to get data out of the system.
–
Randy Wilson
*Team Leader: *Knowledge Management, Data Use and Research
Rwanda Health System Strengthening Activity
Management Sciences for Health
Rwanda-Kigali
Direct: +250 788308835
E-mail: rwilson@msh.org
Skype: wilsonrandy_us

Stronger health systems. Greater health impact.

www.msh.org
This message and its attachments are confidential and solely for the intended recipients. If received in error, please delete them and notify the sender via reply e-mail immediately.
Mailing list: https://launchpad.net/~dhis2-users
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-users
More help : https://help.launchpad.net/ListHelp
–
Ant Snyman
Cell: 0824910449
Landline: 028 2713242
Health Information Systems Program - SA
Knut Staring
Dept. of Informatics, University of Oslo
Norway: +4791880522
Skype: knutstar
http://dhis2.org