Retrieving Data of type value:- Date , text and Yes or No in DHIS2

Hi Felix,

It is still unclear to me. If you have a combination of yes/no, custom text values you definitely need an advanced query pointing each criteria type or create indicator that brings this all in one possible value. If you want simply report these exact data just read them and print, should be fairly easy.

best,

murod

···

On Fri, Nov 29, 2013 at 10:05 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Murod,

Ok, let me explain myself…

Assume i am collecting data with regard to facility Assessment and my data elements

  1. “Lighting and ventilation is observed to be adequate” (YES/NO)
  1. “General state of facility” (Option Set) --i) Poor ii) Fair iii) Average
  1. “Additional notes related to facility” (string)
  1. Does the facility have Handwashing Faility (YES/NO) * This only mean we want to make sure each facility has a handwashing facility

This will be done by a District Public Health Officer. You can aggregate this about the facility. I can actually create a form for the DPHO to enter the data for his monthly facility assessment for each facility. The chief DPHO wants a report about each facility. The data is stored into the value table.

Value Count is a good idea but i don’t want a count but the actual value for analysis

Cheers

I don’t need a count but a

On Fri, Nov 29, 2013 at 7:33 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Felix,

I do not understand for large on what you mean with “raw data” but can say if you are not looking for aggregated data for the sake of SQL requirements, just use count(1 as nothing) or count(value as value) avoiding type cast issues.

best,

murod

Obare Felix
Business Systems Director

SoftCall Communication |Software Engineering & Development| WHMCS Automation | Bulk SMS | Shortcodes Systems|

General Conference Building Riverside Drive off

Chiromo Road Nxt to Australian High Commission :Tel

254 20 3520010|Fax: 254 20 4440969 Cell: 254 733 730035 |254 720 730035

www.softcall.co.ke | fobareh@softcall.co.ke Yahoo Chat:

felondri@yahoo.com| Gchat:fobareh@softcall.co.ke| Skype: softcallcom | Twitter: @smartsms_ke

On Fri, Nov 29, 2013 at 9:19 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Thank you people for you input…

If i can clarify that the query is to pick raw data that otherwise is recorded but we cannot retrieve because its not aggregated. This data would include the following

Here is my full query:-

select county.name as county,subcounty.name as subcounty,division.name as division,

sublocation.name as sublocation, ou.name as facility,triggerstate.value as triggerstate

from organisationunit ou

left outer join _orgunitstructure ous

on (ou.organisationunitid=ous.organisationunitid)

left outer join organisationunit county

on (ous.idlevel2=county.organisationunitid)

left outer join organisationunit subcounty

on (ous.idlevel3=subcounty.organisationunitid)

left outer join organisationunit division

on (ous.idlevel4= division.organisationunitid)

left outer join organisationunit location

on (ous.idlevel5=location.organisationunitid)

left outer join organisationunit Sublocation

on (ous.idlevel6=sublocation.organisationunitid)

left outer join (

select sourceid, sum(cast(value as double precision)) as value

from datavalue

where dataelementid=8305

group by sourceid) as triggerstate on triggerstate.sourceid=ou.organisationunitid

where ous.level=7

========

ERROR: invalid input syntax for type double precision: “false”

********** Error **********

ERROR: invalid input syntax for type double precision: “false”

SQL state: 22P02

======

I want to just get the list of facilities trigger states Trues/False —I dont need aggregated data because the other elements i would want to include are date, i have an element with option type (with text) also stored into value table. I need to retrieve that and create a report.

I can’t use the select sourceid, sum(cast(value as double precision)) as value — To get the raw data

A good test is to use the Siera Leone demo and create a query to display the facility assessment raw data.

  1. “General state of facility”
  1. “Lighting and ventilation is observed to be adequate”

Cheers

On Fri, Nov 29, 2013 at 6:25 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,
Just trying to keep it focused on the original topic, but it seems you want to persist. The initial part of this post was about getting out text data (read the title). In this case, there is no need for any casting or aggregation. Period.

That regex you post again unfortunately will also not work in many cases , especially if you have integers stored are decimals or negative integers which might be perfectly valid. Please refer to the source code for a more appropriate one. The data which I mention is not “dummy data”. We have lots of problems with this type of data here in Zambia as well as other countries, because most of our data is submitted through mobiles. The trunk mobile clients do not have any validation, so getting data like “09” and “9.0” for an integer is quite common. We have added some regex validation to the J2ME client to try and prevent most of this junk for getting in, but with the new Light and Smartphone clients, it does not always work. Normally, we run other scripts to flag these values for “follow-up” automatically, to try and resolve what the number is, but it can take time.

In your case, you might think that this regex suits your needs, and would want to reject all decimals and negative numbers. That is up to you. But in our case, we attempt to salvage as much data as possible. Whether that is the right approach or not, well, maybe we can start a new thread for this. :slight_smile:

Best regards,

Jason

On Fri, Nov 29, 2013 at 5:13 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

I thought my points were of some help to solve the issue and discuss those appearing from initial issue. What you thought about first call is not right and what I proposed was a solution. After you added dummy data being stored, where new topic was introduced (off initial topic). And now you are saying this kind of bureaucratic on topic/off topic things.

With regards to regexp you are not bound with what I demonstrated, you can with the same ease, please use " and dv.value~E’[1][0-9]?$'" instead if the case is so worth, no probs.

best,

murod

On Fri, Nov 29, 2013 at 7:58 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,

I do not think anyone is happy about it, and thus my blueprint to provide server-side regex validation of data, which has yet to be implemented. However, the fact of the matter is, it can happen now, and having some data is usually better than having no data at all. That regex which you mention there will also not work in all cases, because things like “00” are quite common and cannot be cast, thus the need for the “heavier” regex which is used by the analytics process. Such values would pass the regex because it is all digits, but will fail the cast. You may not be happy about it, but it is usually necessary to be sure the values which should be numeric, can actually be cast to an integer.

Again, seemingly off-topic from the original post?

Regards,

Jason

On Fri, Nov 29, 2013 at 4:51 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

By telling these you mean data stored in DHIS2 at some implementations is not trusted, that is not cleaned enough. How can one use these data for analysis? Does it bring TRUE result for decision making? I don’t think such data should exist. At least some Data management action could clean data against data element declarations and value stored. After, there shouldn’t be such worries.

You can use regexp within SQL command to:

select dv.dataelementid, categoryoptioncomboid, periodid, sum(cast (value as int)) as val, dv.sourceid as orgunit, aggregationtype from datavalue dv

left join dataelement de

on dv.dataelementid = de.dataelementid

WHERE aggregationtype = ‘sum’ and domaintype=‘aggregate’ and dv.value~E’^\d+$’

group by dv.dataelementid, dv.sourceid, categoryoptioncomboid, periodid, aggregationtype

Please note this addition and dv.value~E’^\d+$', which makes SQL run failure safe.

But in overall am not happy with such data being stored without cleansing after import.

best,

murod

On Fri, Nov 29, 2013 at 7:10 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,
You would think this would work, and it might for you. The problem is there is no way to “trust” that the raw data values can be cast. We have had a lot of problems with this, but mostly because of 1) Legacy data before the different value types were implemented and 2) data which is received from mobile clients, which do not have the same restrictions as from the front-end. So, by relying on the aggregationtype, you must really know your data is clean, otherwise, it only takes one bad value for the aggregation with SQL not to work. You can see the way the analytics gets around this, by applying a regex. Whether it is “heavy” or not, I guess really depends on your reference frame.

However, getting back to the top of the thread, I thought Felix’s original issue was not being able to data which was stored as a date or text. My original point was that the casting and summing of this type of data is simply not possible with these simple SQL statements. But maybe I misunderstood Felx’s original issue.

Regards,

Jason

On Fri, Nov 29, 2013 at 3:15 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

Yes, it holds true in case if raw data being called. But query was with aggregation elements. Technique I proposed will exactly remove none convertible values from the query, thus there won’t be cast exception. No need to deploy heavy regex or stored procedure here, this is doable. Sample code here:

select dv.dataelementid, categoryoptioncomboid, periodid, sum(cast (value as int)) as val, dv.sourceid as orgunit, aggregationtype from datavalue dv

left join dataelement de

on dv.dataelementid = de.dataelementid

WHERE aggregationtype = ‘sum’ and domaintype=‘aggregate’

group by dv.dataelementid, dv.sourceid, categoryoptioncomboid, periodid, aggregationtype

This should give an idea on how to manage avoiding these exceptions just with plain SQL.

regards,

murod

On Fri, Nov 29, 2013 at 5:46 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Well, it is very hard to see what is going on without the entire query. My point here is there is no way to aggregate text, without some sort of procedure, so you cannot cast anything, nor can you sum it, because in the datavalue table, everything is stored as text. Just like with the analytics, you must first filter out all values which you think can be cast to a double with a regex, and then perform the sum/average/other aggregation operator. If you just want to get the raw data values back, then there should be no aggregation anyway, thus, no need for the sum operator.

Regards,
Jason

On Fri, Nov 29, 2013 at 2:39 PM, Murod Latifov mlatifov@gmail.com wrote:

I am afraid Jason, it won’t work that way. In Postgres aggregating (SUM, AVG) for varchar will give exception. There is no onbuilt type casting at this point.

On Fri, Nov 29, 2013 at 5:09 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

You cannot cast text to a double so just remove that cast operator and you should be fine.

–Sent from my mobile

On Nov 29, 2013 1:16 PM, “Felix Obareh” fobareh@softcall.co.ke wrote:

Thank Jason,
I have decided to go with the SQL Query. I am using the SQL Query example i the user manual together with the demo database for Sierra Leone.

It is working fine for elements of integer type and i can run the query from my pgAdmin and i get them fine. I have a problem with displaying data elements of text and varchar type.

Here is what am running as querry:-

------Trancated--------

left outer join organisationunit Sublocation

on (ous.idlevel6=sublocation.organisationunitid)

left outer join (

select sourceid, sum(cast(value as double precision)) as value

from datavalue

where dataelementid=8305

group by sourceid) as triggerstate on triggerstate.sourceid=ou.organisationunitid

where ous.level=7

****If i run the same query for a data element of date of text type or yes/no

***NB the triggerstate store a true or false value in the the datavalue table

HERE is the SQL error i get

ERROR: invalid input syntax for type double precision: “false”

********** Error **********

ERROR: invalid input syntax for type double precision: “false”

SQL state: 22P02


Please assist


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

On Wed, Nov 27, 2013 at 5:44 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Obare,
The reason for this is because you will only be able to see the aggregated data in DHIS2. In your case, it does not seem you need the aggregated data, but instead the raw data. One way to do this is with a custom report, which would use a JDBC data source, and then an SQL query to retreive the data. You might need some paramaters like the period and the particular orgunit. You can read about custom reports in the DHIS2 documentation. Another possible solution would be an SQL Query, which would allow you to pull out the raw data into a CSV file for subsequent analysis.

Hope this helps.

Regards,

Jason

On Wed, Nov 27, 2013 at 2:34 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Hi,

I a form similar to the Facility Assessment Form in the demo.dhis2.org . I have done some data entry but i don’t know how to get the report for that form

Among my Elements are:-

  1. Date Facility was commissioned
  1. Facility state on Commissioning with options set of ( Ready, Lacking Some Departments, Not ready)
  1. Date of Certification
  1. Date of Verification
  1. Source of Funds with option sets (Government, Donor)
  1. Is facility well maintained (Yes/No)
  1. Key issues identified & suggested solutions (Text)
  1. Number of Practitioners in the Facility

I can the above data on the entry form by i cant create a single report about the facility. The pivot tables on show the value of element 8. The value of the number of practitioners.

Any assistance will be highly appreciated. I have three more Facility based forms which i want to get reports for across the country

Cheers


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp


  1. 1-9 ↩︎

Murod,
The issue here is that these values (text, etc) are not visible through data set reports, as they are not able to be aggregated. So Felix needs a way to get the “raw data” out, without any aggregation. Perhaps I am as well totally off-topic as well, but this was the original request as I understood it. The above query, will simply return the contents of the data value table, with a bit basic information. Likely, it would need to be modified with more fields, and pulled into a Jasper report, along with some paramaters (such as the period and orgunit of interest) along with filtering for specific data elements, which might be present in a data set. We have implemented similar types of reports, but of course, it is possible do to the same thing through the data entry screens now, when you can limit who has access to alter the data. But creating a report or pulling this “raw data” into Excel may also be needed in certain cases. This is the purpose of this query.

Regards,

Jason

···

On Fri, Nov 29, 2013 at 7:39 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Felix,

It is still unclear to me. If you have a combination of yes/no, custom text values you definitely need an advanced query pointing each criteria type or create indicator that brings this all in one possible value. If you want simply report these exact data just read them and print, should be fairly easy.

best,

murod

On Fri, Nov 29, 2013 at 10:05 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Murod,

Ok, let me explain myself…

Assume i am collecting data with regard to facility Assessment and my data elements

  1. “Lighting and ventilation is observed to be adequate” (YES/NO)
  1. “General state of facility” (Option Set) --i) Poor ii) Fair iii) Average
  1. “Additional notes related to facility” (string)
  1. Does the facility have Handwashing Faility (YES/NO) * This only mean we want to make sure each facility has a handwashing facility

This will be done by a District Public Health Officer. You can aggregate this about the facility. I can actually create a form for the DPHO to enter the data for his monthly facility assessment for each facility. The chief DPHO wants a report about each facility. The data is stored into the value table.

Value Count is a good idea but i don’t want a count but the actual value for analysis

Cheers

I don’t need a count but a

On Fri, Nov 29, 2013 at 7:33 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Felix,

I do not understand for large on what you mean with “raw data” but can say if you are not looking for aggregated data for the sake of SQL requirements, just use count(1 as nothing) or count(value as value) avoiding type cast issues.

best,

murod

Obare Felix
Business Systems Director

SoftCall Communication |Software Engineering & Development| WHMCS Automation | Bulk SMS | Shortcodes Systems|

General Conference Building Riverside Drive off

Chiromo Road Nxt to Australian High Commission :Tel

254 20 3520010|Fax: 254 20 4440969 Cell: 254 733 730035 |254 720 730035

www.softcall.co.ke | fobareh@softcall.co.ke Yahoo Chat:

felondri@yahoo.com| Gchat:fobareh@softcall.co.ke| Skype: softcallcom | Twitter: @smartsms_ke

On Fri, Nov 29, 2013 at 9:19 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Thank you people for you input…

If i can clarify that the query is to pick raw data that otherwise is recorded but we cannot retrieve because its not aggregated. This data would include the following

Here is my full query:-

select county.name as county,subcounty.name as subcounty,division.name as division,

sublocation.name as sublocation, ou.name as facility,triggerstate.value as triggerstate

from organisationunit ou

left outer join _orgunitstructure ous

on (ou.organisationunitid=ous.organisationunitid)

left outer join organisationunit county

on (ous.idlevel2=county.organisationunitid)

left outer join organisationunit subcounty

on (ous.idlevel3=subcounty.organisationunitid)

left outer join organisationunit division

on (ous.idlevel4= division.organisationunitid)

left outer join organisationunit location

on (ous.idlevel5=location.organisationunitid)

left outer join organisationunit Sublocation

on (ous.idlevel6=sublocation.organisationunitid)

left outer join (

select sourceid, sum(cast(value as double precision)) as value

from datavalue

where dataelementid=8305

group by sourceid) as triggerstate on triggerstate.sourceid=ou.organisationunitid

where ous.level=7

========

ERROR: invalid input syntax for type double precision: “false”

********** Error **********

ERROR: invalid input syntax for type double precision: “false”

SQL state: 22P02

======

I want to just get the list of facilities trigger states Trues/False —I dont need aggregated data because the other elements i would want to include are date, i have an element with option type (with text) also stored into value table. I need to retrieve that and create a report.

I can’t use the select sourceid, sum(cast(value as double precision)) as value — To get the raw data

A good test is to use the Siera Leone demo and create a query to display the facility assessment raw data.

  1. “General state of facility”
  1. “Lighting and ventilation is observed to be adequate”

Cheers

On Fri, Nov 29, 2013 at 6:25 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,
Just trying to keep it focused on the original topic, but it seems you want to persist. The initial part of this post was about getting out text data (read the title). In this case, there is no need for any casting or aggregation. Period.

That regex you post again unfortunately will also not work in many cases , especially if you have integers stored are decimals or negative integers which might be perfectly valid. Please refer to the source code for a more appropriate one. The data which I mention is not “dummy data”. We have lots of problems with this type of data here in Zambia as well as other countries, because most of our data is submitted through mobiles. The trunk mobile clients do not have any validation, so getting data like “09” and “9.0” for an integer is quite common. We have added some regex validation to the J2ME client to try and prevent most of this junk for getting in, but with the new Light and Smartphone clients, it does not always work. Normally, we run other scripts to flag these values for “follow-up” automatically, to try and resolve what the number is, but it can take time.

In your case, you might think that this regex suits your needs, and would want to reject all decimals and negative numbers. That is up to you. But in our case, we attempt to salvage as much data as possible. Whether that is the right approach or not, well, maybe we can start a new thread for this. :slight_smile:

Best regards,

Jason

On Fri, Nov 29, 2013 at 5:13 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

I thought my points were of some help to solve the issue and discuss those appearing from initial issue. What you thought about first call is not right and what I proposed was a solution. After you added dummy data being stored, where new topic was introduced (off initial topic). And now you are saying this kind of bureaucratic on topic/off topic things.

With regards to regexp you are not bound with what I demonstrated, you can with the same ease, please use " and dv.value~E’[1][0-9]?$'" instead if the case is so worth, no probs.

best,

murod

On Fri, Nov 29, 2013 at 7:58 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,

I do not think anyone is happy about it, and thus my blueprint to provide server-side regex validation of data, which has yet to be implemented. However, the fact of the matter is, it can happen now, and having some data is usually better than having no data at all. That regex which you mention there will also not work in all cases, because things like “00” are quite common and cannot be cast, thus the need for the “heavier” regex which is used by the analytics process. Such values would pass the regex because it is all digits, but will fail the cast. You may not be happy about it, but it is usually necessary to be sure the values which should be numeric, can actually be cast to an integer.

Again, seemingly off-topic from the original post?

Regards,

Jason

On Fri, Nov 29, 2013 at 4:51 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

By telling these you mean data stored in DHIS2 at some implementations is not trusted, that is not cleaned enough. How can one use these data for analysis? Does it bring TRUE result for decision making? I don’t think such data should exist. At least some Data management action could clean data against data element declarations and value stored. After, there shouldn’t be such worries.

You can use regexp within SQL command to:

select dv.dataelementid, categoryoptioncomboid, periodid, sum(cast (value as int)) as val, dv.sourceid as orgunit, aggregationtype from datavalue dv

left join dataelement de

on dv.dataelementid = de.dataelementid

WHERE aggregationtype = ‘sum’ and domaintype=‘aggregate’ and dv.value~E’^\d+$’

group by dv.dataelementid, dv.sourceid, categoryoptioncomboid, periodid, aggregationtype

Please note this addition and dv.value~E’^\d+$', which makes SQL run failure safe.

But in overall am not happy with such data being stored without cleansing after import.

best,

murod

On Fri, Nov 29, 2013 at 7:10 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,
You would think this would work, and it might for you. The problem is there is no way to “trust” that the raw data values can be cast. We have had a lot of problems with this, but mostly because of 1) Legacy data before the different value types were implemented and 2) data which is received from mobile clients, which do not have the same restrictions as from the front-end. So, by relying on the aggregationtype, you must really know your data is clean, otherwise, it only takes one bad value for the aggregation with SQL not to work. You can see the way the analytics gets around this, by applying a regex. Whether it is “heavy” or not, I guess really depends on your reference frame.

However, getting back to the top of the thread, I thought Felix’s original issue was not being able to data which was stored as a date or text. My original point was that the casting and summing of this type of data is simply not possible with these simple SQL statements. But maybe I misunderstood Felx’s original issue.

Regards,

Jason

On Fri, Nov 29, 2013 at 3:15 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

Yes, it holds true in case if raw data being called. But query was with aggregation elements. Technique I proposed will exactly remove none convertible values from the query, thus there won’t be cast exception. No need to deploy heavy regex or stored procedure here, this is doable. Sample code here:

select dv.dataelementid, categoryoptioncomboid, periodid, sum(cast (value as int)) as val, dv.sourceid as orgunit, aggregationtype from datavalue dv

left join dataelement de

on dv.dataelementid = de.dataelementid

WHERE aggregationtype = ‘sum’ and domaintype=‘aggregate’

group by dv.dataelementid, dv.sourceid, categoryoptioncomboid, periodid, aggregationtype

This should give an idea on how to manage avoiding these exceptions just with plain SQL.

regards,

murod

On Fri, Nov 29, 2013 at 5:46 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Well, it is very hard to see what is going on without the entire query. My point here is there is no way to aggregate text, without some sort of procedure, so you cannot cast anything, nor can you sum it, because in the datavalue table, everything is stored as text. Just like with the analytics, you must first filter out all values which you think can be cast to a double with a regex, and then perform the sum/average/other aggregation operator. If you just want to get the raw data values back, then there should be no aggregation anyway, thus, no need for the sum operator.

Regards,
Jason

On Fri, Nov 29, 2013 at 2:39 PM, Murod Latifov mlatifov@gmail.com wrote:

I am afraid Jason, it won’t work that way. In Postgres aggregating (SUM, AVG) for varchar will give exception. There is no onbuilt type casting at this point.

On Fri, Nov 29, 2013 at 5:09 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

You cannot cast text to a double so just remove that cast operator and you should be fine.

–Sent from my mobile

On Nov 29, 2013 1:16 PM, “Felix Obareh” fobareh@softcall.co.ke wrote:

Thank Jason,
I have decided to go with the SQL Query. I am using the SQL Query example i the user manual together with the demo database for Sierra Leone.

It is working fine for elements of integer type and i can run the query from my pgAdmin and i get them fine. I have a problem with displaying data elements of text and varchar type.

Here is what am running as querry:-

------Trancated--------

left outer join organisationunit Sublocation

on (ous.idlevel6=sublocation.organisationunitid)

left outer join (

select sourceid, sum(cast(value as double precision)) as value

from datavalue

where dataelementid=8305

group by sourceid) as triggerstate on triggerstate.sourceid=ou.organisationunitid

where ous.level=7

****If i run the same query for a data element of date of text type or yes/no

***NB the triggerstate store a true or false value in the the datavalue table

HERE is the SQL error i get

ERROR: invalid input syntax for type double precision: “false”

********** Error **********

ERROR: invalid input syntax for type double precision: “false”

SQL state: 22P02


Please assist


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

On Wed, Nov 27, 2013 at 5:44 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Obare,
The reason for this is because you will only be able to see the aggregated data in DHIS2. In your case, it does not seem you need the aggregated data, but instead the raw data. One way to do this is with a custom report, which would use a JDBC data source, and then an SQL query to retreive the data. You might need some paramaters like the period and the particular orgunit. You can read about custom reports in the DHIS2 documentation. Another possible solution would be an SQL Query, which would allow you to pull out the raw data into a CSV file for subsequent analysis.

Hope this helps.

Regards,

Jason

On Wed, Nov 27, 2013 at 2:34 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Hi,

I a form similar to the Facility Assessment Form in the demo.dhis2.org . I have done some data entry but i don’t know how to get the report for that form

Among my Elements are:-

  1. Date Facility was commissioned
  1. Facility state on Commissioning with options set of ( Ready, Lacking Some Departments, Not ready)
  1. Date of Certification
  1. Date of Verification
  1. Source of Funds with option sets (Government, Donor)
  1. Is facility well maintained (Yes/No)
  1. Key issues identified & suggested solutions (Text)
  1. Number of Practitioners in the Facility

I can the above data on the entry form by i cant create a single report about the facility. The pivot tables on show the value of element 8. The value of the number of practitioners.

Any assistance will be highly appreciated. I have three more Facility based forms which i want to get reports for across the country

Cheers


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp


  1. 1-9 ↩︎

Hi Jason,

Than as I said earlier, it should be fairly simple SQL query to datavalue table, adding iner tables from which he wants DE name, Period name, etc. I don’t see any problem here. Call datavalue table, add data element name from iner join of dataelement table and so forth for others, filter on WHERE clause what you want to omit and report is ready. As for Jasper I don’t know how fat it can be productive in producing report in the layout Felix wants it. We use plain word or opendoc to express what is needed, system will decide and make necessary calls to get data in. It is also possible to perform logical and math operations as expression in the template document, like I want this DE minus that DE divided to this Indicator, or logic of “if values is this do this, otherwise do something or nothing”. Sort of “No sql”, no worries, no techie stuff. Design word doc, nice formatted tables and each cell describes what should be in there, that’s it.

best,

murod

···

On Fri, Nov 29, 2013 at 10:46 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Murod,
The issue here is that these values (text, etc) are not visible through data set reports, as they are not able to be aggregated. So Felix needs a way to get the “raw data” out, without any aggregation. Perhaps I am as well totally off-topic as well, but this was the original request as I understood it. The above query, will simply return the contents of the data value table, with a bit basic information. Likely, it would need to be modified with more fields, and pulled into a Jasper report, along with some paramaters (such as the period and orgunit of interest) along with filtering for specific data elements, which might be present in a data set. We have implemented similar types of reports, but of course, it is possible do to the same thing through the data entry screens now, when you can limit who has access to alter the data. But creating a report or pulling this “raw data” into Excel may also be needed in certain cases. This is the purpose of this query.

Regards,

Jason

On Fri, Nov 29, 2013 at 7:39 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Felix,

It is still unclear to me. If you have a combination of yes/no, custom text values you definitely need an advanced query pointing each criteria type or create indicator that brings this all in one possible value. If you want simply report these exact data just read them and print, should be fairly easy.

best,

murod

On Fri, Nov 29, 2013 at 10:05 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Murod,

Ok, let me explain myself…

Assume i am collecting data with regard to facility Assessment and my data elements

  1. “Lighting and ventilation is observed to be adequate” (YES/NO)
  1. “General state of facility” (Option Set) --i) Poor ii) Fair iii) Average
  1. “Additional notes related to facility” (string)
  1. Does the facility have Handwashing Faility (YES/NO) * This only mean we want to make sure each facility has a handwashing facility

This will be done by a District Public Health Officer. You can aggregate this about the facility. I can actually create a form for the DPHO to enter the data for his monthly facility assessment for each facility. The chief DPHO wants a report about each facility. The data is stored into the value table.

Value Count is a good idea but i don’t want a count but the actual value for analysis

Cheers

I don’t need a count but a

On Fri, Nov 29, 2013 at 7:33 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Felix,

I do not understand for large on what you mean with “raw data” but can say if you are not looking for aggregated data for the sake of SQL requirements, just use count(1 as nothing) or count(value as value) avoiding type cast issues.

best,

murod

Obare Felix
Business Systems Director

SoftCall Communication |Software Engineering & Development| WHMCS Automation | Bulk SMS | Shortcodes Systems|

General Conference Building Riverside Drive off

Chiromo Road Nxt to Australian High Commission :Tel

254 20 3520010|Fax: 254 20 4440969 Cell: 254 733 730035 |254 720 730035

www.softcall.co.ke | fobareh@softcall.co.ke Yahoo Chat:

felondri@yahoo.com| Gchat:fobareh@softcall.co.ke| Skype: softcallcom | Twitter: @smartsms_ke

On Fri, Nov 29, 2013 at 9:19 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Thank you people for you input…

If i can clarify that the query is to pick raw data that otherwise is recorded but we cannot retrieve because its not aggregated. This data would include the following

Here is my full query:-

select county.name as county,subcounty.name as subcounty,division.name as division,

sublocation.name as sublocation, ou.name as facility,triggerstate.value as triggerstate

from organisationunit ou

left outer join _orgunitstructure ous

on (ou.organisationunitid=ous.organisationunitid)

left outer join organisationunit county

on (ous.idlevel2=county.organisationunitid)

left outer join organisationunit subcounty

on (ous.idlevel3=subcounty.organisationunitid)

left outer join organisationunit division

on (ous.idlevel4= division.organisationunitid)

left outer join organisationunit location

on (ous.idlevel5=location.organisationunitid)

left outer join organisationunit Sublocation

on (ous.idlevel6=sublocation.organisationunitid)

left outer join (

select sourceid, sum(cast(value as double precision)) as value

from datavalue

where dataelementid=8305

group by sourceid) as triggerstate on triggerstate.sourceid=ou.organisationunitid

where ous.level=7

========

ERROR: invalid input syntax for type double precision: “false”

********** Error **********

ERROR: invalid input syntax for type double precision: “false”

SQL state: 22P02

======

I want to just get the list of facilities trigger states Trues/False —I dont need aggregated data because the other elements i would want to include are date, i have an element with option type (with text) also stored into value table. I need to retrieve that and create a report.

I can’t use the select sourceid, sum(cast(value as double precision)) as value — To get the raw data

A good test is to use the Siera Leone demo and create a query to display the facility assessment raw data.

  1. “General state of facility”
  1. “Lighting and ventilation is observed to be adequate”

Cheers

On Fri, Nov 29, 2013 at 6:25 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,
Just trying to keep it focused on the original topic, but it seems you want to persist. The initial part of this post was about getting out text data (read the title). In this case, there is no need for any casting or aggregation. Period.

That regex you post again unfortunately will also not work in many cases , especially if you have integers stored are decimals or negative integers which might be perfectly valid. Please refer to the source code for a more appropriate one. The data which I mention is not “dummy data”. We have lots of problems with this type of data here in Zambia as well as other countries, because most of our data is submitted through mobiles. The trunk mobile clients do not have any validation, so getting data like “09” and “9.0” for an integer is quite common. We have added some regex validation to the J2ME client to try and prevent most of this junk for getting in, but with the new Light and Smartphone clients, it does not always work. Normally, we run other scripts to flag these values for “follow-up” automatically, to try and resolve what the number is, but it can take time.

In your case, you might think that this regex suits your needs, and would want to reject all decimals and negative numbers. That is up to you. But in our case, we attempt to salvage as much data as possible. Whether that is the right approach or not, well, maybe we can start a new thread for this. :slight_smile:

Best regards,

Jason

On Fri, Nov 29, 2013 at 5:13 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

I thought my points were of some help to solve the issue and discuss those appearing from initial issue. What you thought about first call is not right and what I proposed was a solution. After you added dummy data being stored, where new topic was introduced (off initial topic). And now you are saying this kind of bureaucratic on topic/off topic things.

With regards to regexp you are not bound with what I demonstrated, you can with the same ease, please use " and dv.value~E’[1][0-9]?$'" instead if the case is so worth, no probs.

best,

murod

On Fri, Nov 29, 2013 at 7:58 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,

I do not think anyone is happy about it, and thus my blueprint to provide server-side regex validation of data, which has yet to be implemented. However, the fact of the matter is, it can happen now, and having some data is usually better than having no data at all. That regex which you mention there will also not work in all cases, because things like “00” are quite common and cannot be cast, thus the need for the “heavier” regex which is used by the analytics process. Such values would pass the regex because it is all digits, but will fail the cast. You may not be happy about it, but it is usually necessary to be sure the values which should be numeric, can actually be cast to an integer.

Again, seemingly off-topic from the original post?

Regards,

Jason

On Fri, Nov 29, 2013 at 4:51 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

By telling these you mean data stored in DHIS2 at some implementations is not trusted, that is not cleaned enough. How can one use these data for analysis? Does it bring TRUE result for decision making? I don’t think such data should exist. At least some Data management action could clean data against data element declarations and value stored. After, there shouldn’t be such worries.

You can use regexp within SQL command to:

select dv.dataelementid, categoryoptioncomboid, periodid, sum(cast (value as int)) as val, dv.sourceid as orgunit, aggregationtype from datavalue dv

left join dataelement de

on dv.dataelementid = de.dataelementid

WHERE aggregationtype = ‘sum’ and domaintype=‘aggregate’ and dv.value~E’^\d+$’

group by dv.dataelementid, dv.sourceid, categoryoptioncomboid, periodid, aggregationtype

Please note this addition and dv.value~E’^\d+$', which makes SQL run failure safe.

But in overall am not happy with such data being stored without cleansing after import.

best,

murod

On Fri, Nov 29, 2013 at 7:10 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,
You would think this would work, and it might for you. The problem is there is no way to “trust” that the raw data values can be cast. We have had a lot of problems with this, but mostly because of 1) Legacy data before the different value types were implemented and 2) data which is received from mobile clients, which do not have the same restrictions as from the front-end. So, by relying on the aggregationtype, you must really know your data is clean, otherwise, it only takes one bad value for the aggregation with SQL not to work. You can see the way the analytics gets around this, by applying a regex. Whether it is “heavy” or not, I guess really depends on your reference frame.

However, getting back to the top of the thread, I thought Felix’s original issue was not being able to data which was stored as a date or text. My original point was that the casting and summing of this type of data is simply not possible with these simple SQL statements. But maybe I misunderstood Felx’s original issue.

Regards,

Jason

On Fri, Nov 29, 2013 at 3:15 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

Yes, it holds true in case if raw data being called. But query was with aggregation elements. Technique I proposed will exactly remove none convertible values from the query, thus there won’t be cast exception. No need to deploy heavy regex or stored procedure here, this is doable. Sample code here:

select dv.dataelementid, categoryoptioncomboid, periodid, sum(cast (value as int)) as val, dv.sourceid as orgunit, aggregationtype from datavalue dv

left join dataelement de

on dv.dataelementid = de.dataelementid

WHERE aggregationtype = ‘sum’ and domaintype=‘aggregate’

group by dv.dataelementid, dv.sourceid, categoryoptioncomboid, periodid, aggregationtype

This should give an idea on how to manage avoiding these exceptions just with plain SQL.

regards,

murod

On Fri, Nov 29, 2013 at 5:46 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Well, it is very hard to see what is going on without the entire query. My point here is there is no way to aggregate text, without some sort of procedure, so you cannot cast anything, nor can you sum it, because in the datavalue table, everything is stored as text. Just like with the analytics, you must first filter out all values which you think can be cast to a double with a regex, and then perform the sum/average/other aggregation operator. If you just want to get the raw data values back, then there should be no aggregation anyway, thus, no need for the sum operator.

Regards,
Jason

On Fri, Nov 29, 2013 at 2:39 PM, Murod Latifov mlatifov@gmail.com wrote:

I am afraid Jason, it won’t work that way. In Postgres aggregating (SUM, AVG) for varchar will give exception. There is no onbuilt type casting at this point.

On Fri, Nov 29, 2013 at 5:09 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

You cannot cast text to a double so just remove that cast operator and you should be fine.

–Sent from my mobile

On Nov 29, 2013 1:16 PM, “Felix Obareh” fobareh@softcall.co.ke wrote:

Thank Jason,
I have decided to go with the SQL Query. I am using the SQL Query example i the user manual together with the demo database for Sierra Leone.

It is working fine for elements of integer type and i can run the query from my pgAdmin and i get them fine. I have a problem with displaying data elements of text and varchar type.

Here is what am running as querry:-

------Trancated--------

left outer join organisationunit Sublocation

on (ous.idlevel6=sublocation.organisationunitid)

left outer join (

select sourceid, sum(cast(value as double precision)) as value

from datavalue

where dataelementid=8305

group by sourceid) as triggerstate on triggerstate.sourceid=ou.organisationunitid

where ous.level=7

****If i run the same query for a data element of date of text type or yes/no

***NB the triggerstate store a true or false value in the the datavalue table

HERE is the SQL error i get

ERROR: invalid input syntax for type double precision: “false”

********** Error **********

ERROR: invalid input syntax for type double precision: “false”

SQL state: 22P02


Please assist


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

On Wed, Nov 27, 2013 at 5:44 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Obare,
The reason for this is because you will only be able to see the aggregated data in DHIS2. In your case, it does not seem you need the aggregated data, but instead the raw data. One way to do this is with a custom report, which would use a JDBC data source, and then an SQL query to retreive the data. You might need some paramaters like the period and the particular orgunit. You can read about custom reports in the DHIS2 documentation. Another possible solution would be an SQL Query, which would allow you to pull out the raw data into a CSV file for subsequent analysis.

Hope this helps.

Regards,

Jason

On Wed, Nov 27, 2013 at 2:34 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Hi,

I a form similar to the Facility Assessment Form in the demo.dhis2.org . I have done some data entry but i don’t know how to get the report for that form

Among my Elements are:-

  1. Date Facility was commissioned
  1. Facility state on Commissioning with options set of ( Ready, Lacking Some Departments, Not ready)
  1. Date of Certification
  1. Date of Verification
  1. Source of Funds with option sets (Government, Donor)
  1. Is facility well maintained (Yes/No)
  1. Key issues identified & suggested solutions (Text)
  1. Number of Practitioners in the Facility

I can the above data on the entry form by i cant create a single report about the facility. The pivot tables on show the value of element 8. The value of the number of practitioners.

Any assistance will be highly appreciated. I have three more Facility based forms which i want to get reports for across the country

Cheers


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp


  1. 1-9 ↩︎

Hi Felix,

Your sql query should be something like this:

select de.name as dename, ou.name as ouname, pe.startdate, dv.dataelementid, categoryoptioncomboid, pe.periodid, value as val, dv.sourceid as orgunitid from datavalue dv

left join dataelement de

on dv.dataelementid = de.dataelementid

left join period pe

on dv.periodid = pe.periodid

left join organisationunit ou

on dv.sourceid = ou.organisationunitid

WHERE de.dataelementid in (2323, 2334, 2335) and dv.periodid=343

···

On Fri, Nov 29, 2013 at 10:05 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Murod,

Ok, let me explain myself…

Assume i am collecting data with regard to facility Assessment and my data elements

  1. “Lighting and ventilation is observed to be adequate” (YES/NO)
  1. “General state of facility” (Option Set) --i) Poor ii) Fair iii) Average
  1. “Additional notes related to facility” (string)
  1. Does the facility have Handwashing Faility (YES/NO) * This only mean we want to make sure each facility has a handwashing facility

This will be done by a District Public Health Officer. You can aggregate this about the facility. I can actually create a form for the DPHO to enter the data for his monthly facility assessment for each facility. The chief DPHO wants a report about each facility. The data is stored into the value table.

Value Count is a good idea but i don’t want a count but the actual value for analysis

Cheers

I don’t need a count but a

On Fri, Nov 29, 2013 at 7:33 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Felix,

I do not understand for large on what you mean with “raw data” but can say if you are not looking for aggregated data for the sake of SQL requirements, just use count(1 as nothing) or count(value as value) avoiding type cast issues.

best,

murod

Obare Felix
Business Systems Director

SoftCall Communication |Software Engineering & Development| WHMCS Automation | Bulk SMS | Shortcodes Systems|

General Conference Building Riverside Drive off

Chiromo Road Nxt to Australian High Commission :Tel

254 20 3520010|Fax: 254 20 4440969 Cell: 254 733 730035 |254 720 730035

www.softcall.co.ke | fobareh@softcall.co.ke Yahoo Chat:

felondri@yahoo.com| Gchat:fobareh@softcall.co.ke| Skype: softcallcom | Twitter: @smartsms_ke

On Fri, Nov 29, 2013 at 9:19 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Thank you people for you input…

If i can clarify that the query is to pick raw data that otherwise is recorded but we cannot retrieve because its not aggregated. This data would include the following

Here is my full query:-

select county.name as county,subcounty.name as subcounty,division.name as division,

sublocation.name as sublocation, ou.name as facility,triggerstate.value as triggerstate

from organisationunit ou

left outer join _orgunitstructure ous

on (ou.organisationunitid=ous.organisationunitid)

left outer join organisationunit county

on (ous.idlevel2=county.organisationunitid)

left outer join organisationunit subcounty

on (ous.idlevel3=subcounty.organisationunitid)

left outer join organisationunit division

on (ous.idlevel4= division.organisationunitid)

left outer join organisationunit location

on (ous.idlevel5=location.organisationunitid)

left outer join organisationunit Sublocation

on (ous.idlevel6=sublocation.organisationunitid)

left outer join (

select sourceid, sum(cast(value as double precision)) as value

from datavalue

where dataelementid=8305

group by sourceid) as triggerstate on triggerstate.sourceid=ou.organisationunitid

where ous.level=7

========

ERROR: invalid input syntax for type double precision: “false”

********** Error **********

ERROR: invalid input syntax for type double precision: “false”

SQL state: 22P02

======

I want to just get the list of facilities trigger states Trues/False —I dont need aggregated data because the other elements i would want to include are date, i have an element with option type (with text) also stored into value table. I need to retrieve that and create a report.

I can’t use the select sourceid, sum(cast(value as double precision)) as value — To get the raw data

A good test is to use the Siera Leone demo and create a query to display the facility assessment raw data.

  1. “General state of facility”
  1. “Lighting and ventilation is observed to be adequate”

Cheers

On Fri, Nov 29, 2013 at 6:25 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,
Just trying to keep it focused on the original topic, but it seems you want to persist. The initial part of this post was about getting out text data (read the title). In this case, there is no need for any casting or aggregation. Period.

That regex you post again unfortunately will also not work in many cases , especially if you have integers stored are decimals or negative integers which might be perfectly valid. Please refer to the source code for a more appropriate one. The data which I mention is not “dummy data”. We have lots of problems with this type of data here in Zambia as well as other countries, because most of our data is submitted through mobiles. The trunk mobile clients do not have any validation, so getting data like “09” and “9.0” for an integer is quite common. We have added some regex validation to the J2ME client to try and prevent most of this junk for getting in, but with the new Light and Smartphone clients, it does not always work. Normally, we run other scripts to flag these values for “follow-up” automatically, to try and resolve what the number is, but it can take time.

In your case, you might think that this regex suits your needs, and would want to reject all decimals and negative numbers. That is up to you. But in our case, we attempt to salvage as much data as possible. Whether that is the right approach or not, well, maybe we can start a new thread for this. :slight_smile:

Best regards,

Jason

On Fri, Nov 29, 2013 at 5:13 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

I thought my points were of some help to solve the issue and discuss those appearing from initial issue. What you thought about first call is not right and what I proposed was a solution. After you added dummy data being stored, where new topic was introduced (off initial topic). And now you are saying this kind of bureaucratic on topic/off topic things.

With regards to regexp you are not bound with what I demonstrated, you can with the same ease, please use " and dv.value~E’[1][0-9]?$'" instead if the case is so worth, no probs.

best,

murod

On Fri, Nov 29, 2013 at 7:58 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,

I do not think anyone is happy about it, and thus my blueprint to provide server-side regex validation of data, which has yet to be implemented. However, the fact of the matter is, it can happen now, and having some data is usually better than having no data at all. That regex which you mention there will also not work in all cases, because things like “00” are quite common and cannot be cast, thus the need for the “heavier” regex which is used by the analytics process. Such values would pass the regex because it is all digits, but will fail the cast. You may not be happy about it, but it is usually necessary to be sure the values which should be numeric, can actually be cast to an integer.

Again, seemingly off-topic from the original post?

Regards,

Jason

On Fri, Nov 29, 2013 at 4:51 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

By telling these you mean data stored in DHIS2 at some implementations is not trusted, that is not cleaned enough. How can one use these data for analysis? Does it bring TRUE result for decision making? I don’t think such data should exist. At least some Data management action could clean data against data element declarations and value stored. After, there shouldn’t be such worries.

You can use regexp within SQL command to:

select dv.dataelementid, categoryoptioncomboid, periodid, sum(cast (value as int)) as val, dv.sourceid as orgunit, aggregationtype from datavalue dv

left join dataelement de

on dv.dataelementid = de.dataelementid

WHERE aggregationtype = ‘sum’ and domaintype=‘aggregate’ and dv.value~E’^\d+$’

group by dv.dataelementid, dv.sourceid, categoryoptioncomboid, periodid, aggregationtype

Please note this addition and dv.value~E’^\d+$', which makes SQL run failure safe.

But in overall am not happy with such data being stored without cleansing after import.

best,

murod

On Fri, Nov 29, 2013 at 7:10 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,
You would think this would work, and it might for you. The problem is there is no way to “trust” that the raw data values can be cast. We have had a lot of problems with this, but mostly because of 1) Legacy data before the different value types were implemented and 2) data which is received from mobile clients, which do not have the same restrictions as from the front-end. So, by relying on the aggregationtype, you must really know your data is clean, otherwise, it only takes one bad value for the aggregation with SQL not to work. You can see the way the analytics gets around this, by applying a regex. Whether it is “heavy” or not, I guess really depends on your reference frame.

However, getting back to the top of the thread, I thought Felix’s original issue was not being able to data which was stored as a date or text. My original point was that the casting and summing of this type of data is simply not possible with these simple SQL statements. But maybe I misunderstood Felx’s original issue.

Regards,

Jason

On Fri, Nov 29, 2013 at 3:15 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

Yes, it holds true in case if raw data being called. But query was with aggregation elements. Technique I proposed will exactly remove none convertible values from the query, thus there won’t be cast exception. No need to deploy heavy regex or stored procedure here, this is doable. Sample code here:

select dv.dataelementid, categoryoptioncomboid, periodid, sum(cast (value as int)) as val, dv.sourceid as orgunit, aggregationtype from datavalue dv

left join dataelement de

on dv.dataelementid = de.dataelementid

WHERE aggregationtype = ‘sum’ and domaintype=‘aggregate’

group by dv.dataelementid, dv.sourceid, categoryoptioncomboid, periodid, aggregationtype

This should give an idea on how to manage avoiding these exceptions just with plain SQL.

regards,

murod

On Fri, Nov 29, 2013 at 5:46 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Well, it is very hard to see what is going on without the entire query. My point here is there is no way to aggregate text, without some sort of procedure, so you cannot cast anything, nor can you sum it, because in the datavalue table, everything is stored as text. Just like with the analytics, you must first filter out all values which you think can be cast to a double with a regex, and then perform the sum/average/other aggregation operator. If you just want to get the raw data values back, then there should be no aggregation anyway, thus, no need for the sum operator.

Regards,
Jason

On Fri, Nov 29, 2013 at 2:39 PM, Murod Latifov mlatifov@gmail.com wrote:

I am afraid Jason, it won’t work that way. In Postgres aggregating (SUM, AVG) for varchar will give exception. There is no onbuilt type casting at this point.

On Fri, Nov 29, 2013 at 5:09 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

You cannot cast text to a double so just remove that cast operator and you should be fine.

–Sent from my mobile

On Nov 29, 2013 1:16 PM, “Felix Obareh” fobareh@softcall.co.ke wrote:

Thank Jason,
I have decided to go with the SQL Query. I am using the SQL Query example i the user manual together with the demo database for Sierra Leone.

It is working fine for elements of integer type and i can run the query from my pgAdmin and i get them fine. I have a problem with displaying data elements of text and varchar type.

Here is what am running as querry:-

------Trancated--------

left outer join organisationunit Sublocation

on (ous.idlevel6=sublocation.organisationunitid)

left outer join (

select sourceid, sum(cast(value as double precision)) as value

from datavalue

where dataelementid=8305

group by sourceid) as triggerstate on triggerstate.sourceid=ou.organisationunitid

where ous.level=7

****If i run the same query for a data element of date of text type or yes/no

***NB the triggerstate store a true or false value in the the datavalue table

HERE is the SQL error i get

ERROR: invalid input syntax for type double precision: “false”

********** Error **********

ERROR: invalid input syntax for type double precision: “false”

SQL state: 22P02


Please assist


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

On Wed, Nov 27, 2013 at 5:44 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Obare,
The reason for this is because you will only be able to see the aggregated data in DHIS2. In your case, it does not seem you need the aggregated data, but instead the raw data. One way to do this is with a custom report, which would use a JDBC data source, and then an SQL query to retreive the data. You might need some paramaters like the period and the particular orgunit. You can read about custom reports in the DHIS2 documentation. Another possible solution would be an SQL Query, which would allow you to pull out the raw data into a CSV file for subsequent analysis.

Hope this helps.

Regards,

Jason

On Wed, Nov 27, 2013 at 2:34 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Hi,

I a form similar to the Facility Assessment Form in the demo.dhis2.org . I have done some data entry but i don’t know how to get the report for that form

Among my Elements are:-

  1. Date Facility was commissioned
  1. Facility state on Commissioning with options set of ( Ready, Lacking Some Departments, Not ready)
  1. Date of Certification
  1. Date of Verification
  1. Source of Funds with option sets (Government, Donor)
  1. Is facility well maintained (Yes/No)
  1. Key issues identified & suggested solutions (Text)
  1. Number of Practitioners in the Facility

I can the above data on the entry form by i cant create a single report about the facility. The pivot tables on show the value of element 8. The value of the number of practitioners.

Any assistance will be highly appreciated. I have three more Facility based forms which i want to get reports for across the country

Cheers


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp


  1. 1-9 ↩︎

Team,

Thank Jason and Murod i think this will help me a great deal. Jason you got me me right. Murad your querry is running ok too.

This thread has been very helpful.

Cheers

···

On Fri, Nov 29, 2013 at 9:13 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Felix,

Your sql query should be something like this:

select de.name as dename, ou.name as ouname, pe.startdate, dv.dataelementid, categoryoptioncomboid, pe.periodid, value as val, dv.sourceid as orgunitid from datavalue dv

left join dataelement de

on dv.dataelementid = de.dataelementid

left join period pe

on dv.periodid = pe.periodid

left join organisationunit ou

on dv.sourceid = ou.organisationunitid

WHERE de.dataelementid in (2323, 2334, 2335) and dv.periodid=343

On Fri, Nov 29, 2013 at 10:05 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Murod,

Ok, let me explain myself…

Assume i am collecting data with regard to facility Assessment and my data elements

  1. “Lighting and ventilation is observed to be adequate” (YES/NO)
  1. “General state of facility” (Option Set) --i) Poor ii) Fair iii) Average
  1. “Additional notes related to facility” (string)
  1. Does the facility have Handwashing Faility (YES/NO) * This only mean we want to make sure each facility has a handwashing facility

This will be done by a District Public Health Officer. You can aggregate this about the facility. I can actually create a form for the DPHO to enter the data for his monthly facility assessment for each facility. The chief DPHO wants a report about each facility. The data is stored into the value table.

Value Count is a good idea but i don’t want a count but the actual value for analysis

Cheers

I don’t need a count but a

On Fri, Nov 29, 2013 at 7:33 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Felix,

I do not understand for large on what you mean with “raw data” but can say if you are not looking for aggregated data for the sake of SQL requirements, just use count(1 as nothing) or count(value as value) avoiding type cast issues.

best,

murod

On Fri, Nov 29, 2013 at 9:19 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Thank you people for you input…

If i can clarify that the query is to pick raw data that otherwise is recorded but we cannot retrieve because its not aggregated. This data would include the following

Here is my full query:-

select county.name as county,subcounty.name as subcounty,division.name as division,

sublocation.name as sublocation, ou.name as facility,triggerstate.value as triggerstate

from organisationunit ou

left outer join _orgunitstructure ous

on (ou.organisationunitid=ous.organisationunitid)

left outer join organisationunit county

on (ous.idlevel2=county.organisationunitid)

left outer join organisationunit subcounty

on (ous.idlevel3=subcounty.organisationunitid)

left outer join organisationunit division

on (ous.idlevel4= division.organisationunitid)

left outer join organisationunit location

on (ous.idlevel5=location.organisationunitid)

left outer join organisationunit Sublocation

on (ous.idlevel6=sublocation.organisationunitid)

left outer join (

select sourceid, sum(cast(value as double precision)) as value

from datavalue

where dataelementid=8305

group by sourceid) as triggerstate on triggerstate.sourceid=ou.organisationunitid

where ous.level=7

========

ERROR: invalid input syntax for type double precision: “false”

********** Error **********

ERROR: invalid input syntax for type double precision: “false”

SQL state: 22P02

======

I want to just get the list of facilities trigger states Trues/False —I dont need aggregated data because the other elements i would want to include are date, i have an element with option type (with text) also stored into value table. I need to retrieve that and create a report.

I can’t use the select sourceid, sum(cast(value as double precision)) as value — To get the raw data

A good test is to use the Siera Leone demo and create a query to display the facility assessment raw data.

  1. “General state of facility”
  1. “Lighting and ventilation is observed to be adequate”

Cheers

On Fri, Nov 29, 2013 at 6:25 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,
Just trying to keep it focused on the original topic, but it seems you want to persist. The initial part of this post was about getting out text data (read the title). In this case, there is no need for any casting or aggregation. Period.

That regex you post again unfortunately will also not work in many cases , especially if you have integers stored are decimals or negative integers which might be perfectly valid. Please refer to the source code for a more appropriate one. The data which I mention is not “dummy data”. We have lots of problems with this type of data here in Zambia as well as other countries, because most of our data is submitted through mobiles. The trunk mobile clients do not have any validation, so getting data like “09” and “9.0” for an integer is quite common. We have added some regex validation to the J2ME client to try and prevent most of this junk for getting in, but with the new Light and Smartphone clients, it does not always work. Normally, we run other scripts to flag these values for “follow-up” automatically, to try and resolve what the number is, but it can take time.

In your case, you might think that this regex suits your needs, and would want to reject all decimals and negative numbers. That is up to you. But in our case, we attempt to salvage as much data as possible. Whether that is the right approach or not, well, maybe we can start a new thread for this. :slight_smile:

Best regards,

Jason

On Fri, Nov 29, 2013 at 5:13 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

I thought my points were of some help to solve the issue and discuss those appearing from initial issue. What you thought about first call is not right and what I proposed was a solution. After you added dummy data being stored, where new topic was introduced (off initial topic). And now you are saying this kind of bureaucratic on topic/off topic things.

With regards to regexp you are not bound with what I demonstrated, you can with the same ease, please use " and dv.value~E’[1][0-9]?$'" instead if the case is so worth, no probs.

best,

murod

On Fri, Nov 29, 2013 at 7:58 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,

I do not think anyone is happy about it, and thus my blueprint to provide server-side regex validation of data, which has yet to be implemented. However, the fact of the matter is, it can happen now, and having some data is usually better than having no data at all. That regex which you mention there will also not work in all cases, because things like “00” are quite common and cannot be cast, thus the need for the “heavier” regex which is used by the analytics process. Such values would pass the regex because it is all digits, but will fail the cast. You may not be happy about it, but it is usually necessary to be sure the values which should be numeric, can actually be cast to an integer.

Again, seemingly off-topic from the original post?

Regards,

Jason

On Fri, Nov 29, 2013 at 4:51 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

By telling these you mean data stored in DHIS2 at some implementations is not trusted, that is not cleaned enough. How can one use these data for analysis? Does it bring TRUE result for decision making? I don’t think such data should exist. At least some Data management action could clean data against data element declarations and value stored. After, there shouldn’t be such worries.

You can use regexp within SQL command to:

select dv.dataelementid, categoryoptioncomboid, periodid, sum(cast (value as int)) as val, dv.sourceid as orgunit, aggregationtype from datavalue dv

left join dataelement de

on dv.dataelementid = de.dataelementid

WHERE aggregationtype = ‘sum’ and domaintype=‘aggregate’ and dv.value~E’^\d+$’

group by dv.dataelementid, dv.sourceid, categoryoptioncomboid, periodid, aggregationtype

Please note this addition and dv.value~E’^\d+$', which makes SQL run failure safe.

But in overall am not happy with such data being stored without cleansing after import.

best,

murod

On Fri, Nov 29, 2013 at 7:10 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Murod,
You would think this would work, and it might for you. The problem is there is no way to “trust” that the raw data values can be cast. We have had a lot of problems with this, but mostly because of 1) Legacy data before the different value types were implemented and 2) data which is received from mobile clients, which do not have the same restrictions as from the front-end. So, by relying on the aggregationtype, you must really know your data is clean, otherwise, it only takes one bad value for the aggregation with SQL not to work. You can see the way the analytics gets around this, by applying a regex. Whether it is “heavy” or not, I guess really depends on your reference frame.

However, getting back to the top of the thread, I thought Felix’s original issue was not being able to data which was stored as a date or text. My original point was that the casting and summing of this type of data is simply not possible with these simple SQL statements. But maybe I misunderstood Felx’s original issue.

Regards,

Jason

On Fri, Nov 29, 2013 at 3:15 PM, Murod Latifov mlatifov@gmail.com wrote:

Hi Jason,

Yes, it holds true in case if raw data being called. But query was with aggregation elements. Technique I proposed will exactly remove none convertible values from the query, thus there won’t be cast exception. No need to deploy heavy regex or stored procedure here, this is doable. Sample code here:

select dv.dataelementid, categoryoptioncomboid, periodid, sum(cast (value as int)) as val, dv.sourceid as orgunit, aggregationtype from datavalue dv

left join dataelement de

on dv.dataelementid = de.dataelementid

WHERE aggregationtype = ‘sum’ and domaintype=‘aggregate’

group by dv.dataelementid, dv.sourceid, categoryoptioncomboid, periodid, aggregationtype

This should give an idea on how to manage avoiding these exceptions just with plain SQL.

regards,

murod

On Fri, Nov 29, 2013 at 5:46 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Well, it is very hard to see what is going on without the entire query. My point here is there is no way to aggregate text, without some sort of procedure, so you cannot cast anything, nor can you sum it, because in the datavalue table, everything is stored as text. Just like with the analytics, you must first filter out all values which you think can be cast to a double with a regex, and then perform the sum/average/other aggregation operator. If you just want to get the raw data values back, then there should be no aggregation anyway, thus, no need for the sum operator.

Regards,
Jason

On Fri, Nov 29, 2013 at 2:39 PM, Murod Latifov mlatifov@gmail.com wrote:

I am afraid Jason, it won’t work that way. In Postgres aggregating (SUM, AVG) for varchar will give exception. There is no onbuilt type casting at this point.

On Fri, Nov 29, 2013 at 5:09 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

You cannot cast text to a double so just remove that cast operator and you should be fine.

–Sent from my mobile

On Nov 29, 2013 1:16 PM, “Felix Obareh” fobareh@softcall.co.ke wrote:

Thank Jason,
I have decided to go with the SQL Query. I am using the SQL Query example i the user manual together with the demo database for Sierra Leone.

It is working fine for elements of integer type and i can run the query from my pgAdmin and i get them fine. I have a problem with displaying data elements of text and varchar type.

Here is what am running as querry:-

------Trancated--------

left outer join organisationunit Sublocation

on (ous.idlevel6=sublocation.organisationunitid)

left outer join (

select sourceid, sum(cast(value as double precision)) as value

from datavalue

where dataelementid=8305

group by sourceid) as triggerstate on triggerstate.sourceid=ou.organisationunitid

where ous.level=7

****If i run the same query for a data element of date of text type or yes/no

***NB the triggerstate store a true or false value in the the datavalue table

HERE is the SQL error i get

ERROR: invalid input syntax for type double precision: “false”

********** Error **********

ERROR: invalid input syntax for type double precision: “false”

SQL state: 22P02


Please assist


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp

On Wed, Nov 27, 2013 at 5:44 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi Obare,
The reason for this is because you will only be able to see the aggregated data in DHIS2. In your case, it does not seem you need the aggregated data, but instead the raw data. One way to do this is with a custom report, which would use a JDBC data source, and then an SQL query to retreive the data. You might need some paramaters like the period and the particular orgunit. You can read about custom reports in the DHIS2 documentation. Another possible solution would be an SQL Query, which would allow you to pull out the raw data into a CSV file for subsequent analysis.

Hope this helps.

Regards,

Jason

On Wed, Nov 27, 2013 at 2:34 PM, Felix Obareh fobareh@softcall.co.ke wrote:

Hi,

I a form similar to the Facility Assessment Form in the demo.dhis2.org . I have done some data entry but i don’t know how to get the report for that form

Among my Elements are:-

  1. Date Facility was commissioned
  1. Facility state on Commissioning with options set of ( Ready, Lacking Some Departments, Not ready)
  1. Date of Certification
  1. Date of Verification
  1. Source of Funds with option sets (Government, Donor)
  1. Is facility well maintained (Yes/No)
  1. Key issues identified & suggested solutions (Text)
  1. Number of Practitioners in the Facility

I can the above data on the entry form by i cant create a single report about the facility. The pivot tables on show the value of element 8. The value of the number of practitioners.

Any assistance will be highly appreciated. I have three more Facility based forms which i want to get reports for across the country

Cheers


Mailing list: https://launchpad.net/~dhis2-users

Post to : dhis2-users@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-users

More help : https://help.launchpad.net/ListHelp


  1. 1-9 ↩︎