Error on analytics tables update

Hi Devs,

This could still be under development, not really sure, but just starting to take a look at the analytics stuff and ran into this.

Caused by: org.springframework.dao.DataIntegrityViolationException: StatementCallback; SQL [insert into analytics_temp_2010 (DdOZ3oW8Gu6,uidlevel1,uidlevel2,uidlevel3,uidlevel4,uidlevel5,daily,weekly,monthly,bimonthly,quarterly,sixmonthly,yearly,de,co,level,daysxvalue, daysno, value) select degs.DdOZ3oW8Gu6,ous.uidlevel1,ous.uidlevel2,ous.uidlevel3,ous.uidlevel4,ous.uidlevel5,ps.daily,ps.weekly,ps.monthly,ps.bimonthly,ps.quarterly,ps.sixmonthly,ps.yearly,de.uid,co.uid,ous.level,cast(dv.value as double precision) * ps.daysno as daysxvalue, ps.daysno as daysno, cast(dv.value as double precision) as value from datavalue dv left join _dataelementgroupsetstructure degs on dv.dataelementid=degs.dataelementid left join _organisationunitgroupsetstructure ougs on dv.sourceid=ougs.organisationunitid left join _orgunitstructure ous on dv.sourceid=ous.organisationunitid left join _periodstructure ps on dv.periodid=ps.periodid left join dataelement de on dv.dataelementid=de.dataelementid left join categoryoptioncombo co on dv.categoryoptioncomboid=co.categoryoptioncomboid left join period pe on dv.periodid=pe.periodid where de.valuetype=‘int’ and pe.startdate >= ‘2010-01-01’ and pe.startdate <= '2010-12-31’and dv.value is not null and dv.value != ‘’]; ERROR: invalid input syntax for type double precision: “true”; nested exception is org.postgresql.util.PSQLException: ERROR: invalid input syntax for type double precision: “true”

In this case, it looks like we have some invalid data in the database and the SELECT query does not seem to work as intended.

Use of regular expressions may be required in order to filter out the invalid data

such as

SELECT … from datavalue … and dv.value ~(‘^(0|[1-9][0-9]*)$’)

This is obviously Postgresql syntax, and will only match valid integers.

Of course, the underlying problem is bad data in the database, but not sure if we need to cater to that here.

Regards,

Jason

Hi Jason,

yes you are correct. The issue here is probably that a value “true” has been entered for a data element which has later changed its type from boolean to number.

It needs to be handled in some way because this is likely to happen again. The task at hand is to create a bullet proof set of regex for identifying a numeric value for all supported dbms.

Lars

···

On Wed, Mar 6, 2013 at 6:22 AM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Devs,

This could still be under development, not really sure, but just starting to take a look at the analytics stuff and ran into this.

Caused by: org.springframework.dao.DataIntegrityViolationException: StatementCallback; SQL [insert into analytics_temp_2010 (DdOZ3oW8Gu6,uidlevel1,uidlevel2,uidlevel3,uidlevel4,uidlevel5,daily,weekly,monthly,bimonthly,quarterly,sixmonthly,yearly,de,co,level,daysxvalue, daysno, value) select degs.DdOZ3oW8Gu6,ous.uidlevel1,ous.uidlevel2,ous.uidlevel3,ous.uidlevel4,ous.uidlevel5,ps.daily,ps.weekly,ps.monthly,ps.bimonthly,ps.quarterly,ps.sixmonthly,ps.yearly,de.uid,co.uid,ous.level,cast(dv.value as double precision) * ps.daysno as daysxvalue, ps.daysno as daysno, cast(dv.value as double precision) as value from datavalue dv left join _dataelementgroupsetstructure degs on dv.dataelementid=degs.dataelementid left join _organisationunitgroupsetstructure ougs on dv.sourceid=ougs.organisationunitid left join _orgunitstructure ous on dv.sourceid=ous.organisationunitid left join _periodstructure ps on dv.periodid=ps.periodid left join dataelement de on dv.dataelementid=de.dataelementid left join categoryoptioncombo co on dv.categoryoptioncomboid=co.categoryoptioncomboid left join period pe on dv.periodid=pe.periodid where de.valuetype=‘int’ and pe.startdate >= ‘2010-01-01’ and pe.startdate <= '2010-12-31’and dv.value is not null and dv.value != ‘’]; ERROR: invalid input syntax for type double precision: “true”; nested exception is org.postgresql.util.PSQLException: ERROR: invalid input syntax for type double precision: “true”

In this case, it looks like we have some invalid data in the database and the SELECT query does not seem to work as intended.

Use of regular expressions may be required in order to filter out the invalid data

such as

SELECT … from datavalue … and dv.value ~(‘^(0|[1-9][0-9]*)$’)

This is obviously Postgresql syntax, and will only match valid integers.

Of course, the underlying problem is bad data in the database, but not sure if we need to cater to that here.

Regards,

Jason


Mailing list: https://launchpad.net/~dhis2-devs

Post to : dhis2-devs@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-devs

More help : https://help.launchpad.net/ListHelp

Maybe we could/should consider adding these to the data integrity checks? Might be a better place. If the data was valid to begin with then we would not need this regex to filter it out during the analytics stage, which should speed things up further.

Regards,

Jason

···

Sent from my mobile

On Mar 7, 2013 11:00 AM, “Lars Helge Øverland” larshelge@gmail.com wrote:

Hi Jason,

yes you are correct. The issue here is probably that a value “true” has been entered for a data element which has later changed its type from boolean to number.

It needs to be handled in some way because this is likely to happen again. The task at hand is to create a bullet proof set of regex for identifying a numeric value for all supported dbms.

Lars

On Wed, Mar 6, 2013 at 6:22 AM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Devs,

This could still be under development, not really sure, but just starting to take a look at the analytics stuff and ran into this.

Caused by: org.springframework.dao.DataIntegrityViolationException: StatementCallback; SQL [insert into analytics_temp_2010 (DdOZ3oW8Gu6,uidlevel1,uidlevel2,uidlevel3,uidlevel4,uidlevel5,daily,weekly,monthly,bimonthly,quarterly,sixmonthly,yearly,de,co,level,daysxvalue, daysno, value) select degs.DdOZ3oW8Gu6,ous.uidlevel1,ous.uidlevel2,ous.uidlevel3,ous.uidlevel4,ous.uidlevel5,ps.daily,ps.weekly,ps.monthly,ps.bimonthly,ps.quarterly,ps.sixmonthly,ps.yearly,de.uid,co.uid,ous.level,cast(dv.value as double precision) * ps.daysno as daysxvalue, ps.daysno as daysno, cast(dv.value as double precision) as value from datavalue dv left join _dataelementgroupsetstructure degs on dv.dataelementid=degs.dataelementid left join _organisationunitgroupsetstructure ougs on dv.sourceid=ougs.organisationunitid left join _orgunitstructure ous on dv.sourceid=ous.organisationunitid left join _periodstructure ps on dv.periodid=ps.periodid left join dataelement de on dv.dataelementid=de.dataelementid left join categoryoptioncombo co on dv.categoryoptioncomboid=co.categoryoptioncomboid left join period pe on dv.periodid=pe.periodid where de.valuetype=‘int’ and pe.startdate >= ‘2010-01-01’ and pe.startdate <= '2010-12-31’and dv.value is not null and dv.value != ‘’]; ERROR: invalid input syntax for type double precision: “true”; nested exception is org.postgresql.util.PSQLException: ERROR: invalid input syntax for type double precision: “true”

In this case, it looks like we have some invalid data in the database and the SELECT query does not seem to work as intended.

Use of regular expressions may be required in order to filter out the invalid data

such as

SELECT … from datavalue … and dv.value ~(‘^(0|[1-9][0-9]*)$’)

This is obviously Postgresql syntax, and will only match valid integers.

Of course, the underlying problem is bad data in the database, but not sure if we need to cater to that here.

Regards,

Jason


Mailing list: https://launchpad.net/~dhis2-devs

Post to : dhis2-devs@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-devs

More help : https://help.launchpad.net/ListHelp

Yes that could be an option. But i think we would still need regex to identify bad data as pulling all data records into java is problematic…

···

On Mar 8, 2013 9:06 AM, “Jason Pickering” jason.p.pickering@gmail.com wrote:

Maybe we could/should consider adding these to the data integrity checks? Might be a better place. If the data was valid to begin with then we would not need this regex to filter it out during the analytics stage, which should speed things up further.

Regards,

Jason

Sent from my mobile

On Mar 7, 2013 11:00 AM, “Lars Helge Øverland” larshelge@gmail.com wrote:

Hi Jason,

yes you are correct. The issue here is probably that a value “true” has been entered for a data element which has later changed its type from boolean to number.

It needs to be handled in some way because this is likely to happen again. The task at hand is to create a bullet proof set of regex for identifying a numeric value for all supported dbms.

Lars

On Wed, Mar 6, 2013 at 6:22 AM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Devs,

This could still be under development, not really sure, but just starting to take a look at the analytics stuff and ran into this.

Caused by: org.springframework.dao.DataIntegrityViolationException: StatementCallback; SQL [insert into analytics_temp_2010 (DdOZ3oW8Gu6,uidlevel1,uidlevel2,uidlevel3,uidlevel4,uidlevel5,daily,weekly,monthly,bimonthly,quarterly,sixmonthly,yearly,de,co,level,daysxvalue, daysno, value) select degs.DdOZ3oW8Gu6,ous.uidlevel1,ous.uidlevel2,ous.uidlevel3,ous.uidlevel4,ous.uidlevel5,ps.daily,ps.weekly,ps.monthly,ps.bimonthly,ps.quarterly,ps.sixmonthly,ps.yearly,de.uid,co.uid,ous.level,cast(dv.value as double precision) * ps.daysno as daysxvalue, ps.daysno as daysno, cast(dv.value as double precision) as value from datavalue dv left join _dataelementgroupsetstructure degs on dv.dataelementid=degs.dataelementid left join _organisationunitgroupsetstructure ougs on dv.sourceid=ougs.organisationunitid left join _orgunitstructure ous on dv.sourceid=ous.organisationunitid left join _periodstructure ps on dv.periodid=ps.periodid left join dataelement de on dv.dataelementid=de.dataelementid left join categoryoptioncombo co on dv.categoryoptioncomboid=co.categoryoptioncomboid left join period pe on dv.periodid=pe.periodid where de.valuetype=‘int’ and pe.startdate >= ‘2010-01-01’ and pe.startdate <= '2010-12-31’and dv.value is not null and dv.value != ‘’]; ERROR: invalid input syntax for type double precision: “true”; nested exception is org.postgresql.util.PSQLException: ERROR: invalid input syntax for type double precision: “true”

In this case, it looks like we have some invalid data in the database and the SELECT query does not seem to work as intended.

Use of regular expressions may be required in order to filter out the invalid data

such as

SELECT … from datavalue … and dv.value ~(‘^(0|[1-9][0-9]*)$’)

This is obviously Postgresql syntax, and will only match valid integers.

Of course, the underlying problem is bad data in the database, but not sure if we need to cater to that here.

Regards,

Jason


Mailing list: https://launchpad.net/~dhis2-devs

Post to : dhis2-devs@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-devs

More help : https://help.launchpad.net/ListHelp