Use of the multidimensional model in reports and data analysis

Hi,

To follow up on our long-running discussion of the multidimensional model, I have updated a blueprint on how to make use of this model within the existing concepts of datamart and report table. The blueprint is here:

https://blueprints.launchpad.net/dhis2/+spec/flexible-multidimensional-aggregation

It is meant to be a start for the developers in order to get out a first version as soon as possible (so that we can start to make real use of the model), and not meant to cover all needs for reporting.

Up for discussion of course.

Ola

···

Hi there. I tried to give feedback via Launchpad, but could not figure
out how to do it. A few initial thoughts now, and perhaps some more
later.

I think the blueprint is a good start and captures the must-haves,
but here are some of my thoughts.

It is clear that there will be group sets of data elements whose
dimensionality does not converge. One of the Excel sheets I sent on
that long thread shows that there are different age group dimensions
for different data elements (which, if we were using 2.0 for data entry
here in Zambia, could be implemented as category combos). Again, I do
not think in Java, so it is difficult for me to determine what is
possible. However, I have used the "tablefunc" functions of Postgres,
and one of the requirements is that you generally need to know what
the possible dimensions of a data set are in order to generate a cross
tabulated table. There are ways around this, by first providing some
type of introspection into what the possible dimensions might be
(Period, Data element, Organizational unit, Disease type, gender, age
group, etc) and then adapting the query to all of these cases. I would
assume this is how Java would handle this.
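
The introspection step described above could be sketched roughly as follows. This is only an illustration in Python, not how DHIS 2 or "tablefunc" does it: the record layout, the names CORE_DIMENSIONS, discover_dimensions and to_crosstab_rows, and the sample records (including the "Total outpatient visits" element and all values) are assumptions made up for the example.

```python
# Every data value is assumed to carry these three dimensions.
CORE_DIMENSIONS = ["data_element", "period", "org_unit"]

# Hypothetical sample data: the second record has no extra dimensions.
RECORDS = [
    {"dims": {"data_element": "Confirmed cases of malaria under 5",
              "period": "March 2009",
              "org_unit": "Chibombo District",
              "disease": "Malaria",
              "age_group": "Under 5"},
     "value": 1},
    {"dims": {"data_element": "Total outpatient visits",
              "period": "March 2009",
              "org_unit": "Chibombo District"},
     "value": 40},
]


def discover_dimensions(records):
    """Return the core dimensions followed by any extras, in first-seen order."""
    dims = list(CORE_DIMENSIONS)
    for record in records:
        for name in record["dims"]:
            if name not in dims:
                dims.append(name)
    return dims


def to_crosstab_rows(records, dims):
    """Build one row per record; dimensions a record lacks become None."""
    return [[record["dims"].get(d) for d in dims] + [record["value"]]
            for record in records]


dims = discover_dimensions(RECORDS)
print(dims + ["value"])
for row in to_crosstab_rows(RECORDS, dims):
    print(row)
```

The point is only that the column list is computed from the data first, and the table (or query) is then built from that list rather than from a fixed set of dimensions.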

I can envision a situation where we will end up with a ragged data
set that contains lots of different dimensions. All data values should
have the data element, period, and organization unit dimensions. Some
data values may have no dimensions other than these, while some
may be assigned category options and data element group set dimensions
(which, in my view, should be one and the same when it comes to
analyzing the data). So, I would like to have two types of
tables available to me.

1) A crosstabbed table that could be generated for a selected set of
data elements, similar to the aggregateddatavalues table. I guess
this should be pretty straightforward, and similar to the report
tables. Obviously, there would need to be some thought put into the
GUI, as it is currently only possible to crosstab the data on a few
dimensions. The GUI would need to be rendered at runtime, based on
what the possible dimensions contained in a given report table data
set would look like.

2) A second table, a fact table I guess, which I personally think would
be much more useful, would be something like this:

rowid  category             value
1      Value                1
1      DataElementName      Confirmed cases of malaria under 5
1      Period               March 2009
1      Organizational Unit  Chibombo District
1      Disease              Malaria
1      Transmission method  Vector borne
1      Age group            Under 5
2      DataElementName      Confirmed cases of leishmaniasis under 5
2      Period               March 2009
2      Organizational Unit  Chibombo District
2      Disease              Leishmaniasis
2      Transmission method  Vector borne
2      Age group            Under 5

Essentially, this is the structure that, at least in SQL, can be used to
create a crosstabbed table from a given data set fairly easily, assuming
that I actually know what all the "dimensions" are. I am sure it would
need to be adapted somehow, but it would allow report implementers to
aggregate data within the reports without having to unfold the crosstab
report table. The current aggregated data value table is not going to
work, as its dimensions are fixed. This table structure would offer much
more flexibility to implementers building reports or OLAP cubes, as they
could slice out particular data sets through views and pass them on to
OLAP or reporting engines for further analysis. With the above table, I
could easily answer the question "How many cases of vector borne
diseases have there been in Chibombo district in March 2009?"
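
To make that last question concrete, here is a rough sketch using Python's built-in sqlite3 module and an in-memory table shaped like the one above. The table name "fact", the column name "row_id", the helper names build_db and total_cases, and the value 2 for the second row are all assumptions for illustration; none of this is an existing DHIS 2 table or API.

```python
import sqlite3

# Rows shaped like the rowid/category/value table above.
FACTS = [
    (1, "Value", "1"),
    (1, "DataElementName", "Confirmed cases of malaria under 5"),
    (1, "Period", "March 2009"),
    (1, "Organizational Unit", "Chibombo District"),
    (1, "Disease", "Malaria"),
    (1, "Transmission method", "Vector borne"),
    (1, "Age group", "Under 5"),
    (2, "Value", "2"),  # assumed value, for illustration only
    (2, "DataElementName", "Confirmed cases of leishmaniasis under 5"),
    (2, "Period", "March 2009"),
    (2, "Organizational Unit", "Chibombo District"),
    (2, "Disease", "Leishmaniasis"),
    (2, "Transmission method", "Vector borne"),
    (2, "Age group", "Under 5"),
]


def build_db():
    """Create an in-memory database holding the sample fact rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fact (row_id INTEGER, category TEXT, value TEXT)")
    conn.executemany("INSERT INTO fact VALUES (?, ?, ?)", FACTS)
    return conn


def total_cases(conn, filters):
    """Sum 'Value' over the row ids whose dimensions match every filter."""
    sql = "SELECT COALESCE(SUM(CAST(v.value AS INTEGER)), 0) FROM fact v"
    params = []
    # One self-join per requested dimension, so any set of dimensions works.
    for i, (category, value) in enumerate(filters.items()):
        sql += (f" JOIN fact d{i} ON d{i}.row_id = v.row_id"
                f" AND d{i}.category = ? AND d{i}.value = ?")
        params += [category, value]
    sql += " WHERE v.category = 'Value'"
    return conn.execute(sql, params).fetchone()[0]


question = {
    "Transmission method": "Vector borne",
    "Organizational Unit": "Chibombo District",
    "Period": "March 2009",
}
print(total_cases(build_db(), question))  # prints 3 with the sample rows
```

Because the filter is just a list of category/value pairs, the same query shape works for any dimension that happens to exist in the table, which is the flexibility I am after.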

These are my first thoughts. More later, I am sure. Enough for tonight, though.

Best regards,
Jason

···

On Mon, Oct 26, 2009 at 5:37 PM, Ola Hodne Titlestad <olatitle@gmail.com> wrote:

Hi,

To follow up on our long debated discussion on the multidimensional model I
have updated a blueprint on how to make use of this model within the
existing concepts of datamart and report table. The blueprint is here:
https://blueprints.launchpad.net/dhis2/+spec/flexible-multidimensional-aggregation

It is meant to be a start for the developers in order to get out a first
version as soon as possible (so that we can start to make real use of the
model), and not meant to cover all needs for reporting.

Up for discussion of course.

Ola