dxf2 data

Just emerging from implementing data exchange with the iHRIS HR system in
Kenya. The attached is my first attempt at a working draft
documenting what is going on with the data exchange format in the
importexport service module. I'll put it under the documentation
project as soon as I've firmed up some edges, but meanwhile please
feel free to comment.

Regards
Bob

dxf2_doc.pdf (22.3 KB)

Thanks Bob for the report.

On Tue, Sep 13, 2011 at 3:51 PM, Bob Jolliffe bobjolliffe@gmail.com wrote:

> Just emerging from implementing data exchange with iHRIS HR system in
> Kenya. The attached is my first attempt at a working draft of
> documenting what is going on re the data exchange format in the
> importexport service module. I'll put it under the documentation
> project as soon as I've firmed up some edges, but meanwhile please
> feel free to comment.
>
> Regards
> Bob

Mailing list: https://launchpad.net/~dhis2-devs

Post to : dhis2-devs@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-devs

More help : https://help.launchpad.net/ListHelp


Samuel Cheburet
Ministry Of Health
P.O. Box 20781
Nairobi, Kenya
Mobile- 0721624338

Don’t Compromise The Quality! Don’t Risk It! apply Available Standards to Achieve Your/organizational Goal.

Hi Bob,

thanks for the well-written document, I think it is very sensible and
clearly describes the direction where we want to move. Agree fully on
the points on separation of meta- and data, idscheme, and period
representation.

My only comment is about the representation of dimensions by
"exploding" our category model onto optional attributes. This is good
when dealing with third-party systems but does represent an overhead
when communicating between dhis systems (including mobile). I thought
SDMX-HD was our format for communicating with third-party sources - if
dxf becomes "third-party friendly" then where does that leave SDMX?

Lars

2011/9/19 Lars Helge Øverland <larshelge@gmail.com>:

> My only comment is about the representation of dimensions by
> "exploding" our category model onto optional attributes. This is good
> when dealing with third-party systems but does represent an overhead
> when communicating between dhis systems (including mobile).

Good question. My sense is that the categoryoptioncombo is an
internal representation of dimensionality within dhis which should not
*necessarily* be exposed to 3rd parties. I think this also can and
does include mobile use cases. In fact Jo has been on my case to
implement the dimensions, particularly with a mobile use case in mind.

Having said that, there is also nothing which prevents an
application-specific datavalueset from using a catoptcombo attribute,
and we can certainly support that internally. The web-api currently
does this for example.

I'm currently thinking about how best to export the structural
metadata (codelists and the like) required to compose datavalue
messages. I need to have a bit of discussion about how the data
dictionary is envisaged, but it seems that one should be able to
browse the data dictionary and export relevant codelists from that,
and/or pull them from the web-api. It could well be that it makes sense
to export a dimensionset codelist along the lines of (shooting from
the hip):

<dxf:dimensionsets>
   <dxf:dimensionset id="345" SEX="M" AGE="under5" />
   <dxf:dimensionset id="346" SEX="F" AGE="under5" />
   <dxf:dimensionset id="347" SEX="M" AGE="5andOver" />
   <dxf:dimensionset id="348" SEX="F" AGE="5andOver" />
</dxf:dimensionsets>

Which might allow an abbreviated form of datavalue along the lines of:

<dxf:dataValue dataelement="45" orgunit="42" period="201001/P2M"
dimensionset="345" value="56" />

(Note the funny period - that's Jan/Feb 2010). I think this explicit
approach would be more grokkable by external systems (including our
own tightly coupled ones) than transmitting the entire lattice of
categoryoptioncombo links, and would look quite familiar to XSLT
programmers who know the attribute-set construct in XSLT.
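
For anyone wondering how a receiver might read that period notation, here is a minimal sketch. It assumes the value is always a YYYYMM start month, a slash, and an ISO 8601-style duration in whole months; the function name `parse_period` and the choice to return an exclusive end date are illustrative assumptions, not part of dxf.

```python
import re
from datetime import date

# Assumed shape: YYYYMM start month, a slash, then a duration in whole
# months written ISO 8601-style, for example "201001/P2M".
PERIOD_RE = re.compile(r"(\d{4})(\d{2})/P(\d+)M")


def parse_period(text):
    """Return (start, end) dates, where end is the first day after the period."""
    match = PERIOD_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised period: {text!r}")
    year, month, months = (int(group) for group in match.groups())
    if not 1 <= month <= 12 or months < 1:
        raise ValueError(f"invalid month or duration in period: {text!r}")
    start = date(year, month, 1)
    # Do the month arithmetic on a zero-based month index so that
    # periods crossing a year boundary roll over correctly.
    index = year * 12 + (month - 1) + months
    end = date(index // 12, index % 12 + 1, 1)
    return start, end


start, end = parse_period("201001/P2M")
print(start, end)
```

With the sample value, the start is 1 January 2010 and the exclusive end is 1 March 2010, which matches the Jan/Feb reading.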

Of course the downside is that it is yet another codelist to transmit
and perhaps something else to model on the other side. That seems to
be the eternal compromise. What you gain on the swings you lose on
the roundabouts :)
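
To give a feel for what "something else to model on the other side" could amount to, here is a rough sketch of a receiver loading the dimensionset codelist and expanding an abbreviated datavalue. The namespace URI is a placeholder invented so the `dxf:` prefix resolves, the attribute spelling `dataelement` is assumed, and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, chosen only so that the dxf: prefix resolves.
DXF_NS = "http://example.org/dxf/2.0"

CODELIST = f"""
<dxf:dimensionsets xmlns:dxf="{DXF_NS}">
  <dxf:dimensionset id="345" SEX="M" AGE="under5" />
  <dxf:dimensionset id="346" SEX="F" AGE="under5" />
  <dxf:dimensionset id="347" SEX="M" AGE="5andOver" />
  <dxf:dimensionset id="348" SEX="F" AGE="5andOver" />
</dxf:dimensionsets>
"""

DATA_VALUE = (
    f'<dxf:dataValue xmlns:dxf="{DXF_NS}" dataelement="45" orgunit="42" '
    'period="201001/P2M" dimensionset="345" value="56" />'
)


def load_dimensionsets(xml_text):
    """Map each dimensionset id to its dimension attributes."""
    root = ET.fromstring(xml_text)
    table = {}
    for node in root.findall(f"{{{DXF_NS}}}dimensionset"):
        attrs = dict(node.attrib)
        set_id = attrs.pop("id")
        table[set_id] = attrs
    return table


def expand_data_value(xml_text, table):
    """Replace the dimensionset reference with the dimensions it stands for."""
    attrs = dict(ET.fromstring(xml_text).attrib)
    dimensions = table[attrs.pop("dimensionset")]
    return {**attrs, **dimensions}


table = load_dimensionsets(CODELIST)
print(expand_data_value(DATA_VALUE, table))
```

The lookup table is the extra codelist mentioned above: small to build, but it does have to be fetched and kept in step with the sender.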

> I thought SDMX-HD was our format for communicating with third-party
> sources - if dxf becomes "third-party friendly" then where does that
> leave SDMX?

That could well turn out to be an interesting question given the
rumblings within WHO :)

But having a close mapping from (the useful parts of) sdmx to dxf is
very convenient, and in the process makes our dxf representation of
datavalues more sensible I think.

Bob

On 19 Sep 2011 at 12:21, Bob Jolliffe wrote:

> Good question. My sense is that the categoryoptioncombo is an
> internal representation of dimensionality within dhis which should not
> *necessarily* be exposed to 3rd parties. I think this also can and
> does include mobile use cases. In fact Jo has been on my case to
> implement the dimensions, particularly with a mobile use case in mind.

Do note, though, that the use case for now is mostly about sending data value sets by way of a server.

If we were to make an external api with direct mobile access in mind, it would seem the openrosa api would be the way to do that (judging by the existing set of clients out there).

<dxf:dimensionsets>
  <dxf:dimensionset id="345" SEX="M" AGE="under5" />
  <dxf:dimensionset id="346" SEX="F" AGE="under5" />
  <dxf:dimensionset id="347" SEX="M" AGE="5andOver" />
  <dxf:dimensionset id="348" SEX="F" AGE="5andOver" />
</dxf:dimensionsets>

This is what we currently do with the "in-house" mobile clients, although with the uglier non-attribute format (don't trust the attribute naming). Note also that we have the concept of "greying" certain catoptcombos for specific data sets, and to support that (we haven't implemented it for mobile, but should) these combos really are specific to each data set.

> (Note the funny period - that's Jan/Feb 2010). I think this explicit
> approach would be more grokkable by external systems (including our
> own tightly coupled ones) than transmitting the entire lattice of
> categoryoptioncombo links, and would look quite familiar to XSLT
> programmers who know the attribute-set construct in XSLT.

If we can enforce safe attribute names and there is an easy enough way to link this extendable attribute set through jaxb, I'd use this notation over the generic one. I think we should try to be consistent, though, and I guess we need to decide now if this xml-naming-constraint is something we want to enforce. Morten's attributes have the same issue...
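
As a rough idea of what enforcing safe attribute names might look like, here is a small sketch. It checks a deliberately conservative ASCII subset of what XML allows in a name, and the set of reserved names is an assumption based on the attributes used in the samples in this thread; none of this is an agreed rule.

```python
import re

# Conservative ASCII subset of XML names: a letter or underscore first,
# then letters, digits, underscore, hyphen or dot. Colons are left out so
# a dimension name can never be mistaken for a namespace prefix.
SAFE_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")

# Assumed fixed attributes that a dimension name must not collide with.
RESERVED = {"id", "dataelement", "orgunit", "period", "value", "dimensionset"}


def is_safe_attribute_name(name):
    """Return True if name can be used directly as a dimension attribute."""
    if name.lower().startswith("xml"):
        return False  # names starting with "xml" are reserved by the XML spec
    if name in RESERVED:
        return False
    return SAFE_NAME_RE.fullmatch(name) is not None


for candidate in ["SEX", "AGE", "Age group", "5andOver", "value"]:
    print(candidate, is_safe_attribute_name(candidate))
```

Names that fail the check would need either renaming at the source or the generic non-attribute notation, which is the consistency decision raised above.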

>> I thought SDMX-HD was our format for communicating with third-party
>> sources - if dxf becomes "third-party friendly" then where does that
>> leave SDMX?

> That could well turn out to be an interesting question given the
> rumblings within WHO :)
>
> But having a close mapping from (the useful parts of) sdmx to dxf is
> very convenient, and in the process makes our dxf representation of
> datavalues more sensible I think.

I kind of end up with the same question sometimes. SDMX-HD doesn't specify any web API model, though, and we need to think through these representation problems anyway. So let's see where it leads us. For now I can't really see any obvious reason a generalized dxf couldn't give us what we need: being "third-party friendly", mapping easily to SDMX, and being efficient for internal use. Where to draw the line between generalizing and keeping our specific domain logic can be tricky, though.

Jo

Hi Bob,

Regarding the multidimensionality of Data Elements I have a suggestion. This double identifier for a DE should be removed altogether. For me each DE/categoryoption combination is a real Data Element. So I suggest generating another set of unique ids based on the DE/categoryoption combination. So let's call the current DEs "categories" and the categoryoptions "options". Each relation between them would be a DE with its own unique id. In the datavalue table we would then have three keys: DE, Period, OrgUnit (categoryoptioncomboid would be gone). This would eliminate lots of issues, like the one presented here. I guess this could be brought into play neatly with full backward compatibility.

Another suggestion: we have almost the same type of objects throughout the metadata representation. These could also be unified into one, the same way as Java containers. These objects are the groups and sets of groups for DE, Indicator, OrgUnits, etc. If we create a container whose sole purpose is to group elements of some type of object, we could achieve this. I know this is out of scope for this discussion, but as we are talking about generalization, I thought it would be good to mention here.

Cheers,
murod

2011/9/19 Murod Latifov <mlatifov@gmail.com>:

> Regarding the multidimensionality of Data Elements I have a suggestion. This
> double identifier for a DE should be removed altogether. For me each
> DE/categoryoption combination is a real Data Element. So I suggest
> generating another set of unique ids based on the DE/categoryoption
> combination. So let's call the current DEs "categories" and the
> categoryoptions "options". Each relation between them would be a DE with its
> own unique id. In the datavalue table we would then have three keys: DE,
> Period, OrgUnit (categoryoptioncomboid would be gone). This would eliminate
> lots of issues, like the one presented here. I guess this could be brought
> into play neatly with full backward compatibility.

Interesting suggestion. A sort of pseudo dataelement where ALL the
keys, including the DE, are collapsed into one. Sort of extreme
mcdonalds. Which could work well for some use cases and less well for
others. It depends largely on the shape of the data from the other
system. So for the javarosa case which Jo has brought up, the
instance data on a form is a fairly simple instance of the underlying
xforms model, which as far as I know has no inherent dimensionality.
So in that case it would be fine. I would worry about maintaining the
integrity of the database within dhis, mind you. We already struggle a
bit with the integrity of vanilla categoryoptioncombos. This uber version
might make the tangle even more tangled :)

If the data is coming from a system with a dimensional model (eg
disaggregating by sex or age) then it is not so tidy - some mapping
has to be done somewhere and codelists need to be exchanged.
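
To make that mapping concrete, here is a toy sketch of the flattened-id idea next to the dimensional form. The id format (`45:M:under5`), the function name and the sample categories are all made up for illustration and say nothing about how dhis would actually store this.

```python
from itertools import product


def build_flat_ids(data_element_ids, categories):
    """Map each flattened id to its (data element, dimensions) pair.

    categories maps a dimension name to its list of options; one flattened
    id is produced per data element and per combination of options.
    """
    names = list(categories)
    flat = {}
    for data_element in data_element_ids:
        for options in product(*(categories[name] for name in names)):
            flat_id = ":".join((data_element, *options))
            flat[flat_id] = (data_element, dict(zip(names, options)))
    return flat


flat = build_flat_ids(
    ["45"],
    {"SEX": ["M", "F"], "AGE": ["under5", "5andOver"]},
)
for flat_id, (data_element, dimensions) in flat.items():
    print(flat_id, data_element, dimensions)
```

A system without dimensions can treat each flattened id as an ordinary data element, while a dimensional system still needs the table above (in effect another codelist) to get back to SEX and AGE.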

> Another suggestion: we have almost the same type of objects throughout the
> metadata representation. These could also be unified into one, the same way
> as Java containers. These objects are the groups and sets of groups for DE,
> Indicator, OrgUnits, etc. If we create a container whose sole purpose is to
> group elements of some type of object, we could achieve this. I know this is
> out of scope for this discussion, but as we are talking about generalization,
> I thought it would be good to mention here.

Generalization of the group/groupset notion and retreat from the
categoryoptions brings us back a bit to the zooks thread - that the
best use of categoryoptions is for laying out forms, not for dimensioning
of data. I'm not sure if I have the energy to go back there yet :) Mind
you, I think there could be lots of benefits to generalizing
group/groupsets, regardless.

For now I am grappling with categorycombo ...

Cheers
Bob
