More on the data browser

Hi Hieu and Lars.

After digesting our conversation this morning, I played around with
the code and did this in StatementManagerDataBrowserStore.java

    public Integer setCountDataElementsForOrgUnitBetweenPeriods(
DataBrowserTable table, Integer orgUnitId,
        List<Integer> betweenPeriodIds )
    {
        StatementHolder holder = statementManager.getHolder();

        Integer numResults = 0;
        StringBuffer sqlsb = new StringBuffer();

        int i = 0;
        for ( Integer periodId : betweenPeriodIds )
        {
            i++;

            sqlsb
                .append( " SELECT de.dataelementid, de.name AS
DataElement, dv.value AS counts_of_aggregated_values, p.periodid AS
PeriodId, p.startDate AS ColumnHeader " );
            sqlsb.append( "FROM dataelement AS de " );
            sqlsb.append( "INNER JOIN datavalue AS dv ON
(de.dataelementid = dv.dataelementid) " );
            sqlsb.append( "INNER JOIN organisationunit AS o ON
(dv.sourceid = o.organisationunitid) " );
            sqlsb.append( "JOIN period p ON (dv.periodid = p.periodid) " );
            sqlsb.append( "WHERE o.organisationunitid = '" + orgUnitId + "' " );
            sqlsb.append( "AND dv.periodid = '" + periodId + "' " );

            sqlsb.append( i == betweenPeriodIds.size() ? "ORDER BY
PeriodId " : "\n UNION \n" );
        }
...

The results look something like this..

Data Element January 2010 February 2010 March 2010
Alive on ART after 12 months (calc) 7 6 9
Amoxicillin 125mg 5ml suspension 75ml stock out 0 0 0
Antenatal 1st visit 20 weeks or later 39 36 46
Antenatal 1st visit before 20 weeks 6 6 34
Antenatal 1st visit by woman <18 years 4 3 4
Antenatal 1st visits total (calc) 45 42 80
Antenatal client collecting HIV test results 45 42 84

The numbers here, are actual numbers, not counts. And in just a few
clicks, I can easily extract out data for a given orgunit and time
periods. Much simpler than going through the data mart/report
tables.So using this workflow, we can quickly see what data is
available for a given set of orgunits and time periods, and then
quickly drill down to the actual data. We could probably have two
modes. View data and view summary. Viewing the data would provide the
actual data, while viewing the summary could give you the counts of
the data elements, which is not really as useful once you have
constrained two dimensions dimensions (periodid, orgunit)

Now, this is not going to work for multidimensional datasets, because
there are not unique data elements, but for the Zambia database, which
does not use McDonalds (category combinations, etc) this is exactly
the sort of functionality we have been needed for a long time. A very
quick way to quickly drill down to raw data for a given orgunit/period
combination. The problem with the data browser at the moment, is that
I only get counts at the end of the drill-down. In our case, I only
get ones and zeros. 1s for anything that has been submitted, a 0 for
anything that has not. Getting to the data through the data entry
forms is painful, and it is not possible to see multiple time periods.

We might should think if this is desired functionality for others, and
how to generalize it to databases with McDonalds implemented. Does not
seem to be too difficult really, but not sure if it is desirable.

Would be good to hear with others think.

Regards,
Jason

···

--
Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

The results look something like this..

Data Element January 2010 February 2010 March 2010
Alive on ART after 12 months (calc) 7 6 9
Amoxicillin 125mg 5ml suspension 75ml stock out 0 0 0
Antenatal 1st visit 20 weeks or later 39 36 46
Antenatal 1st visit before 20 weeks 6 6 34
Antenatal 1st visit by woman <18 years 4 3 4
Antenatal 1st visits total (calc) 45 42 80
Antenatal client collecting HIV test results 45 42 84

The numbers here, are actual numbers, not counts. And in just a few
clicks, I can easily extract out data for a given orgunit and time
periods. Much simpler than going through the data mart/report
tables.So using this workflow, we can quickly see what data is
available for a given set of orgunits and time periods, and then
quickly drill down to the actual data. We could probably have two
modes. View data and view summary. Viewing the data would provide the
actual data, while viewing the summary could give you the counts of
the data elements, which is not really as useful once you have
constrained two dimensions dimensions (periodid, orgunit)

Now, this is not going to work for multidimensional datasets, because
there are not unique data elements, but for the Zambia database, which
does not use McDonalds (category combinations, etc) this is exactly
the sort of functionality we have been needed for a long time. A very
quick way to quickly drill down to raw data for a given orgunit/period
combination. The problem with the data browser at the moment, is that
I only get counts at the end of the drill-down. In our case, I only
get ones and zeros. 1s for anything that has been submitted, a 0 for
anything that has not. Getting to the data through the data entry
forms is painful, and it is not possible to see multiple time periods.

We might should think if this is desired functionality for others, and
how to generalize it to databases with McDonalds implemented. Does not
seem to be too difficult really, but not sure if it is desirable.

I would be surprised if this was not desirable more generally.
Currently one can find the data with the data browser, make a note of
the parameters (orgunit, period), try and guess what dataset the data
belong to, and then go to Data Entry to look at what the data actually
is. Kind of contradicts the theme of Output != Input. Thanks for
doing this work, Jason.

k

···

2010/11/24 Jason Pickering <jason.p.pickering@gmail.com>:

Would be good to hear with others think.

Regards,
Jason

--
Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

_______________________________________________
Mailing list: https://launchpad.net/~dhis2-devs
Post to : dhis2-devs@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-devs
More help : https://help.launchpad.net/ListHelp

--
Cheers,
Knut Staring

I have created a blueprint here, and think this also addresses how we
can display the categoryoptioncombo dimension, by collapsing it into
the data element dimension

https://blueprints.launchpad.net/dhis2/+spec/view-data-in-data-browser

···

2010/11/24 Knut Staring <knutst@gmail.com>:

2010/11/24 Jason Pickering <jason.p.pickering@gmail.com>:

The results look something like this..

Data Element January 2010 February 2010 March 2010
Alive on ART after 12 months (calc) 7 6 9
Amoxicillin 125mg 5ml suspension 75ml stock out 0 0 0
Antenatal 1st visit 20 weeks or later 39 36 46
Antenatal 1st visit before 20 weeks 6 6 34
Antenatal 1st visit by woman <18 years 4 3 4
Antenatal 1st visits total (calc) 45 42 80
Antenatal client collecting HIV test results 45 42 84

The numbers here, are actual numbers, not counts. And in just a few
clicks, I can easily extract out data for a given orgunit and time
periods. Much simpler than going through the data mart/report
tables.So using this workflow, we can quickly see what data is
available for a given set of orgunits and time periods, and then
quickly drill down to the actual data. We could probably have two
modes. View data and view summary. Viewing the data would provide the
actual data, while viewing the summary could give you the counts of
the data elements, which is not really as useful once you have
constrained two dimensions dimensions (periodid, orgunit)

Now, this is not going to work for multidimensional datasets, because
there are not unique data elements, but for the Zambia database, which
does not use McDonalds (category combinations, etc) this is exactly
the sort of functionality we have been needed for a long time. A very
quick way to quickly drill down to raw data for a given orgunit/period
combination. The problem with the data browser at the moment, is that
I only get counts at the end of the drill-down. In our case, I only
get ones and zeros. 1s for anything that has been submitted, a 0 for
anything that has not. Getting to the data through the data entry
forms is painful, and it is not possible to see multiple time periods.

We might should think if this is desired functionality for others, and
how to generalize it to databases with McDonalds implemented. Does not
seem to be too difficult really, but not sure if it is desirable.

I would be surprised if this was not desirable more generally.
Currently one can find the data with the data browser, make a note of
the parameters (orgunit, period), try and guess what dataset the data
belong to, and then go to Data Entry to look at what the data actually
is. Kind of contradicts the theme of Output != Input. Thanks for
doing this work, Jason.

k

Would be good to hear with others think.

Regards,
Jason

--
Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

_______________________________________________
Mailing list: DHIS 2 developers in Launchpad
Post to : dhis2-devs@lists.launchpad.net
Unsubscribe : DHIS 2 developers in Launchpad
More help : ListHelp - Launchpad Help

--
Cheers,
Knut Staring

--
Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

Hi, just to clarify, is this stuff now aggregating data in time/space or showing only the “raw” data?

Lars

···

2010/11/24 Jason Pickering jason.p.pickering@gmail.com

Hi Hieu and Lars.

After digesting our conversation this morning, I played around with

the code and did this in StatementManagerDataBrowserStore.java

public Integer setCountDataElementsForOrgUnitBetweenPeriods(

DataBrowserTable table, Integer orgUnitId,

    List<Integer> betweenPeriodIds )

{

    StatementHolder holder = statementManager.getHolder();



    Integer numResults = 0;

    StringBuffer sqlsb = new StringBuffer();



    int i = 0;

    for ( Integer periodId : betweenPeriodIds )

    {

        i++;



        sqlsb

            .append(  " SELECT de.dataelementid, [de.name](http://de.name) AS

DataElement, dv.value AS counts_of_aggregated_values, p.periodid AS

PeriodId, p.startDate AS ColumnHeader " );

        sqlsb.append( "FROM dataelement AS de " );

        sqlsb.append( "INNER JOIN datavalue AS dv ON

(de.dataelementid = dv.dataelementid) " );

        sqlsb.append( "INNER JOIN organisationunit AS o ON

(dv.sourceid = o.organisationunitid) " );

        sqlsb.append( "JOIN period p ON (dv.periodid = p.periodid) " );

        sqlsb.append( "WHERE o.organisationunitid = '" + orgUnitId + "' " );

        sqlsb.append( "AND dv.periodid = '" + periodId + "' " );



        sqlsb.append( i == betweenPeriodIds.size() ? "ORDER BY

PeriodId " : “\n UNION \n” );

    }

The results look something like this…

Data Element January 2010 February 2010 March 2010

Alive on ART after 12 months (calc) 7 6 9

Amoxicillin 125mg 5ml suspension 75ml stock out 0 0 0

Antenatal 1st visit 20 weeks or later 39 36 46

Antenatal 1st visit before 20 weeks 6 6 34

Antenatal 1st visit by woman <18 years 4 3 4

Antenatal 1st visits total (calc) 45 42 80

Antenatal client collecting HIV test results 45 42 84

The numbers here, are actual numbers, not counts. And in just a few

clicks, I can easily extract out data for a given orgunit and time

periods. Much simpler than going through the data mart/report

tables.So using this workflow, we can quickly see what data is

available for a given set of orgunits and time periods, and then

quickly drill down to the actual data. We could probably have two

modes. View data and view summary. Viewing the data would provide the

actual data, while viewing the summary could give you the counts of

the data elements, which is not really as useful once you have

constrained two dimensions dimensions (periodid, orgunit)

Now, this is not going to work for multidimensional datasets, because

there are not unique data elements, but for the Zambia database, which

does not use McDonalds (category combinations, etc) this is exactly

the sort of functionality we have been needed for a long time. A very

quick way to quickly drill down to raw data for a given orgunit/period

combination. The problem with the data browser at the moment, is that

I only get counts at the end of the drill-down. In our case, I only

get ones and zeros. 1s for anything that has been submitted, a 0 for

anything that has not. Getting to the data through the data entry

forms is painful, and it is not possible to see multiple time periods.

We might should think if this is desired functionality for others, and

how to generalize it to databases with McDonalds implemented. Does not

seem to be too difficult really, but not sure if it is desirable.

Would be good to hear with others think.

Raw data only.

The idea is just to provide a quick way for people to access the raw
data, without having to go through the data entry, which is kinda
stupid anyway. The PivotTable would be nice, but it only works on
aggregate data. I suppose I could do a datamart for all
orgunit/timeperiods/dataelements but this seems wasteful as well.
With the combination of the data summary (to see where you actually
have data), the drill down functionality (to get finer details), and
the ability to display raw data for a single orgunit for multiple time
periods, we have a way for users to quickly see where there are gaps
in the data, as well as to pull out raw data when needed. A bit more
functionality, like the ability to filter by dataset/data element
group, and with similar functionality in other modes (for instance, to
look at a single data element for multiple orgunits and time periods),
just provides a quick way to get to the raw data without having to
wait for a DataMart/ReportTable to execute.

I guess I just keep thinking that I should be able to browse data in
the data browser.

Attached is my version of 2.0.5 code which demonstrates this. Tested
on the Zambia database. Use at your own peril.

Regards,
Jason

Regards,
Jason

StatementManagerDataBrowserStore.java (26.9 KB)

···

On 11/25/10, Lars Helge Øverland <larshelge@gmail.com> wrote:

2010/11/24 Jason Pickering <jason.p.pickering@gmail.com>

Hi Hieu and Lars.

After digesting our conversation this morning, I played around with
the code and did this in StatementManagerDataBrowserStore.java

   public Integer setCountDataElementsForOrgUnitBetweenPeriods(
DataBrowserTable table, Integer orgUnitId,
       List<Integer> betweenPeriodIds )
   {
       StatementHolder holder = statementManager.getHolder();

       Integer numResults = 0;
       StringBuffer sqlsb = new StringBuffer();

       int i = 0;
       for ( Integer periodId : betweenPeriodIds )
       {
           i++;

           sqlsb
               .append( " SELECT de.dataelementid, de.name AS
DataElement, dv.value AS counts_of_aggregated_values, p.periodid AS
PeriodId, p.startDate AS ColumnHeader " );
           sqlsb.append( "FROM dataelement AS de " );
           sqlsb.append( "INNER JOIN datavalue AS dv ON
(de.dataelementid = dv.dataelementid) " );
           sqlsb.append( "INNER JOIN organisationunit AS o ON
(dv.sourceid = o.organisationunitid) " );
           sqlsb.append( "JOIN period p ON (dv.periodid = p.periodid) " );
           sqlsb.append( "WHERE o.organisationunitid = '" + orgUnitId + "'
" );
           sqlsb.append( "AND dv.periodid = '" + periodId + "' " );

           sqlsb.append( i == betweenPeriodIds.size() ? "ORDER BY
PeriodId " : "\n UNION \n" );
       }
...

The results look something like this..

Data Element January 2010 February 2010 March 2010
Alive on ART after 12 months (calc) 7 6 9
Amoxicillin 125mg 5ml suspension 75ml stock out 0 0 0
Antenatal 1st visit 20 weeks or later 39 36 46
Antenatal 1st visit before 20 weeks 6 6 34
Antenatal 1st visit by woman <18 years 4 3 4
Antenatal 1st visits total (calc) 45 42 80
Antenatal client collecting HIV test results 45 42 84

The numbers here, are actual numbers, not counts. And in just a few
clicks, I can easily extract out data for a given orgunit and time
periods. Much simpler than going through the data mart/report
tables.So using this workflow, we can quickly see what data is
available for a given set of orgunits and time periods, and then
quickly drill down to the actual data. We could probably have two
modes. View data and view summary. Viewing the data would provide the
actual data, while viewing the summary could give you the counts of
the data elements, which is not really as useful once you have
constrained two dimensions dimensions (periodid, orgunit)

Now, this is not going to work for multidimensional datasets, because
there are not unique data elements, but for the Zambia database, which
does not use McDonalds (category combinations, etc) this is exactly
the sort of functionality we have been needed for a long time. A very
quick way to quickly drill down to raw data for a given orgunit/period
combination. The problem with the data browser at the moment, is that
I only get counts at the end of the drill-down. In our case, I only
get ones and zeros. 1s for anything that has been submitted, a 0 for
anything that has not. Getting to the data through the data entry
forms is painful, and it is not possible to see multiple time periods.

We might should think if this is desired functionality for others, and
how to generalize it to databases with McDonalds implemented. Does not
seem to be too difficult really, but not sure if it is desirable.

Would be good to hear with others think.

Hi, just to clarify, is this stuff now aggregating data in time/space or
showing only the "raw" data?

Lars

--
Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

This sounds great and will be much appreciated by many users I am sure.

Ola

···

2010/11/25 Jason Pickering jason.p.pickering@gmail.com

Raw data only.

The idea is just to provide a quick way for people to access the raw

data, without having to go through the data entry, which is kinda

stupid anyway. The PivotTable would be nice, but it only works on

aggregate data. I suppose I could do a datamart for all

orgunit/timeperiods/dataelements but this seems wasteful as well.

With the combination of the data summary (to see where you actually

have data), the drill down functionality (to get finer details), and

the ability to display raw data for a single orgunit for multiple time

periods, we have a way for users to quickly see where there are gaps

in the data, as well as to pull out raw data when needed. A bit more

functionality, like the ability to filter by dataset/data element

group, and with similar functionality in other modes (for instance, to

look at a single data element for multiple orgunits and time periods),

just provides a quick way to get to the raw data without having to

wait for a DataMart/ReportTable to execute.


I guess I just keep thinking that I should be able to browse data in

the data browser.

Attached is my version of 2.0.5 code which demonstrates this. Tested

on the Zambia database. Use at your own peril.

Regards,

Jason

Regards,

Jason

On 11/25/10, Lars Helge Øverland larshelge@gmail.com wrote:

2010/11/24 Jason Pickering jason.p.pickering@gmail.com

Hi Hieu and Lars.

After digesting our conversation this morning, I played around with

the code and did this in StatementManagerDataBrowserStore.java

public Integer setCountDataElementsForOrgUnitBetweenPeriods(

DataBrowserTable table, Integer orgUnitId,

   List<Integer> betweenPeriodIds )

{

   StatementHolder holder = statementManager.getHolder();
   Integer numResults = 0;
   StringBuffer sqlsb = new StringBuffer();
   int i = 0;
   for ( Integer periodId : betweenPeriodIds )
   {
       i++;
       sqlsb
           .append(  " SELECT de.dataelementid, [de.name](http://de.name) AS

DataElement, dv.value AS counts_of_aggregated_values, p.periodid AS

PeriodId, p.startDate AS ColumnHeader " );

       sqlsb.append( "FROM dataelement AS de " );
       sqlsb.append( "INNER JOIN datavalue AS dv ON

(de.dataelementid = dv.dataelementid) " );

       sqlsb.append( "INNER JOIN organisationunit AS o ON

(dv.sourceid = o.organisationunitid) " );

       sqlsb.append( "JOIN period p ON (dv.periodid = p.periodid) " );
       sqlsb.append( "WHERE o.organisationunitid = '" + orgUnitId + "'

" );

       sqlsb.append( "AND dv.periodid = '" + periodId + "' " );
       sqlsb.append( i == betweenPeriodIds.size() ? "ORDER BY

PeriodId " : “\n UNION \n” );

   }

The results look something like this…

Data Element January 2010 February 2010 March 2010

Alive on ART after 12 months (calc) 7 6 9

Amoxicillin 125mg 5ml suspension 75ml stock out 0 0 0

Antenatal 1st visit 20 weeks or later 39 36 46

Antenatal 1st visit before 20 weeks 6 6 34

Antenatal 1st visit by woman <18 years 4 3 4

Antenatal 1st visits total (calc) 45 42 80

Antenatal client collecting HIV test results 45 42 84

The numbers here, are actual numbers, not counts. And in just a few

clicks, I can easily extract out data for a given orgunit and time

periods. Much simpler than going through the data mart/report

tables.So using this workflow, we can quickly see what data is

available for a given set of orgunits and time periods, and then

quickly drill down to the actual data. We could probably have two

modes. View data and view summary. Viewing the data would provide the

actual data, while viewing the summary could give you the counts of

the data elements, which is not really as useful once you have

constrained two dimensions dimensions (periodid, orgunit)

Now, this is not going to work for multidimensional datasets, because

there are not unique data elements, but for the Zambia database, which

does not use McDonalds (category combinations, etc) this is exactly

the sort of functionality we have been needed for a long time. A very

quick way to quickly drill down to raw data for a given orgunit/period

combination. The problem with the data browser at the moment, is that

I only get counts at the end of the drill-down. In our case, I only

get ones and zeros. 1s for anything that has been submitted, a 0 for

anything that has not. Getting to the data through the data entry

forms is painful, and it is not possible to see multiple time periods.

We might should think if this is desired functionality for others, and

how to generalize it to databases with McDonalds implemented. Does not

seem to be too difficult really, but not sure if it is desirable.

Would be good to hear with others think.

Hi, just to clarify, is this stuff now aggregating data in time/space or

showing only the “raw” data?

Lars

Jason P. Pickering

email: jason.p.pickering@gmail.com

tel:+260968395190


Mailing list: https://launchpad.net/~dhis2-devs

Post to : dhis2-devs@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-devs

More help : https://help.launchpad.net/ListHelp