Two form transcription conundrums ...

Greetings

Back in the land of the little people and the bogs. I've just
returned from a week in Rwanda (which was much too short) where I had
the opportunity to work with Arthur Heywood and some good people in
the Rwanda MOH who have done a great job at redesigning the MOH
collection instruments, i.e. the forms.

Most of my week was spent on infrastructure stuff but towards the end
of the week I had a (first) opportunity to work on the frontend
of the dhis process. This was a *really* worthwhile process for me
and I learned a great deal. Anyway, in the process of helping the
implementors move from the form design to coding the dhis dataelements
I am left with two interesting implementor design issues which I would
appreciate some input on.

The first relates to the attached picture file
(split_dataelement.png). A naive implementation of this form section,
which I suspect might also be the most common, is to make the contents
of the Diagnosis column (Mental Problems, Epilepsy, Diabetes etc) the
dataelements, and to create two categories (a Sex/Gender category
and another category for new cases, old cases, referrals and deaths).
The temptation to do this is of course very strong, because we can
then use the automatic form layout facility to present an html form
which looks like the paper form. But to me at least, this seems
clearly wrong ... in fact something of an evil.

New and old cases have nothing to do with referrals and deaths (apples
and oranges). What a more nuanced reading of this form actually
reveals is two dataelements for each entry in the Diagnosis column,
e.g. Epilepsy Cases and Epilepsy Outcomes. I've tried to illustrate this
by boxing them off in the table. If I am reading this right then the
use of auto form layout is positively driving the implementor to
create both wrong dataelements and wrong categories.
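
To make the difference concrete, here is a rough sketch of the two models in Python. It is only an illustration: the option names ("Male", "New", "Referred" and so on) are assumptions, not taken from the Rwanda form, and the plain dicts stand in for dhis metadata rather than reproduce any dhis API.

```python
from itertools import product

# Option names are assumptions for illustration.
SEX = ["Male", "Female"]


def option_combos(*categories):
    """Cross product of the given categories: one option from each."""
    return list(product(*categories))


# Naive model: one dataelement per diagnosis, with cases and outcomes
# squeezed into a single category so the auto layout matches the paper form.
naive = {
    "Epilepsy": option_combos(
        ["New cases", "Old cases", "Referrals", "Deaths"], SEX
    ),
}

# Split model: two dataelements, each with a category whose options
# really are alternatives to one another.
split = {
    "Epilepsy Cases": option_combos(["New", "Old"], SEX),
    "Epilepsy Outcomes": option_combos(["Referred", "Died"], SEX),
}


def cell_count(model):
    """Number of entry cells (dataelement x option combo) in a model."""
    return sum(len(combos) for combos in model.values())


print(cell_count(naive), cell_count(split))
```

Both models carry the same number of entry cells, so nothing on the paper form is lost by splitting. What changes is that a total across a category in the split model adds like to like, whereas a total across the naive category adds cases to deaths.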

In fact a deeper look at this (since drawing the picture) seems to
indicate that a better dataelement breakdown might even be "Chronic
Disease and Mental Health Cases" and "Chronic Disease and Mental
Health Outcomes" for which we have an additional category (dimension)
for the diagnosis which would be a subset of a list of ICD10 codes and
descriptions. This breakdown has the combined effects of reducing the
dataelement count as well as incorporating ICD codelists which could
be useful for cross-matching against other data later.
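
A small sketch of what this buys in dataelement count. The ICD-10 blocks below are an illustrative mapping chosen for the example; the actual subset would have to come from the MOH form, and the tuple used as a value key is a hypothetical representation, not how dhis stores data.

```python
# Illustrative ICD-10 blocks; the real list would come from the MOH form.
DIAGNOSES = {
    "F00-F99": "Mental and behavioural disorders",
    "G40": "Epilepsy",
    "E10-E14": "Diabetes mellitus",
}


def per_diagnosis_elements(diagnoses):
    """Two dataelements for every diagnosis row on the form."""
    return [
        f"{name} {kind}"
        for name in diagnoses.values()
        for kind in ("Cases", "Outcomes")
    ]


# With diagnosis as a category, the dataelement list no longer grows
# with the number of diagnoses.
GENERIC_ELEMENTS = [
    "Chronic Disease and Mental Health Cases",
    "Chronic Disease and Mental Health Outcomes",
]

# A single reported value is then addressed by dataelement plus one
# option from each category (diagnosis, case status, sex).
value_key = (GENERIC_ELEMENTS[0], "G40", "New", "Female")

print(len(per_diagnosis_elements(DIAGNOSES)), len(GENERIC_ELEMENTS))
```

With three diagnoses the per-diagnosis approach already needs six dataelements against two, and the gap widens with every row added to the form.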

So as a consequence of this little bit of analysis, I have recommended
to the implementors that they *really*, *really*, *really* should
avoid auto-form layout completely as following the logic of form
layout necessarily leads them to the evil of poorly constructed
structural metadata. The issue is illustrated in the attachment, but
this pattern repeats itself on various parts of the form. But I am
surely not the only one to have observed the anti-pattern of
structuring dataelements and categories on the basis of what we want
the dataentry form to look like. I would welcome some feedback from
others on this before I lead them down this road. Do we have any good
experience from the field elsewhere on this subtle process of teasing
out the actual dataelements and dimensions from their visual
representation on forms? From the many databases I have looked at,
we certainly do have many examples of the naive, crude approach.

For the moment the implementors are prepared to put in the effort of
doing custom form design. My suggestion has been to put the form
layout on the back burner initially - this week - and focus on
identifying correct, or at least sensible, dataelements and
categories. As a carrot, I have promised to provide some simple
tooling to help make the generation of html tables and mapped cells a
bit easier. In the short term this won't be in dhis as we are too
close to release. I'm not even too sure how I'm going to do this yet,
but I am prepared to do anything to prevent them doing the
datastructuring wrong.

Right. That was conundrum number one. Conundrum number two is not
entirely unrelated ....

We are all familiar with our current problem of agegroups, or more
generally the issue of unique category option codes. There are the
usual categories like {"<5",">=5"} and {"under5","5-18",">=19"} etc.
We have a longer term plan for tying these up with concepts but that
is not targeted at the upcoming release. But after 15 cups of coffee
on Saturday afternoon it occurred to me we could have a reasonable
workaround. It seems there is nothing to prevent the implementors
having a single age group category containing *all* the age groups.
The main thing one loses by doing this is that the auto-form layout
would be a mess (theoretically they can grey out the unused options
but that would lead to more grey than white on the form). But in
light of the discussion above regarding identifying dataelements and
dimensions, we have decided to avoid the evil of the auto form layout
anyway. In which case, the implementor can, indeed he must, select
the categoryoptioncombos to use individually anyway. So he can
constrain the agegroups to collect against in his design of the custom
html form layout which seems adequate to me.
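
Roughly, the workaround looks like this. It is a sketch under assumptions: the dataelement names are placeholders, the shared list treats "<5" and "under5" as one band stored once, and the functions are hypothetical helpers rather than anything in dhis.

```python
# One age group category holding every age band in use anywhere.
AGE_GROUPS = ["<5", ">=5", "5-18", ">=19"]


def form_cells(dataelement, selected):
    """Cells a custom form would carry for one dataelement.

    Raises ValueError if an option is not in the shared category.
    """
    unknown = [opt for opt in selected if opt not in AGE_GROUPS]
    if unknown:
        raise ValueError(f"not in age group category: {unknown}")
    return [(dataelement, opt) for opt in selected]


def unused_options(selected):
    """Options the auto layout would have to grey out for this dataelement."""
    return [opt for opt in AGE_GROUPS if opt not in selected]


# Hypothetical dataelement names.
two_band = form_cells("Example dataelement A", ["<5", ">=5"])
three_band = form_cells("Example dataelement B", ["<5", "5-18", ">=19"])
```

The custom form only ever shows the selected cells, so the unused options never appear. With auto layout they would all be present and greyed out, and the list from `unused_options` grows with every age band added to the shared category.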

So a question: are there any unforeseen consequences I might have
missed in creating a single age group category and using it like this?
I can't off hand think of any which are too serious, but I could be
wrong.

Final note - a strong motivation for trying to get this structural
metadata sensible is that in Rwanda, at least in the medium term,
dhis2 is likely to be acting as the authoritative health information
metadata repository for the country. This means that other 3rd party
systems will be obliged to use metadata codelists provided by dhis to
mediate exchange between quite a heterogeneous mix of systems. Given
this requirement we have to somehow avoid inscribing the two dhis
idiosyncrasies described above into the metadata design. I *think*
the approach described will effectively work around the problems, but
welcome some feedback.

Regards
Bob

A slightly softened position ... it's not auto form layout which is
necessarily evil. With well structured dataelements, the auto layout
will produce reasonable forms. If we want to produce forms like
the attached snippet using auto form layout, we can, but only at the
cost of making silly categories.

···

On 3 October 2011 12:28, Bob Jolliffe <bobjolliffe@gmail.com> wrote:

[...]

Hi,

I think your data element/category definitions make sense in this
case. One should clearly not join the four category options you
mention into the same category here.

In Kenya we discovered that 90 % of the forms fit with the classic
data-elements-on-the-rows-and-categories-on-the-columns pattern,
meaning the auto-form function was not evil but rather quite handy.

When designing data elements it's also important to make sure it
becomes easy to produce analysis - ideally what you typically want to
look for in reports should act as data elements. Your approach here
might work fine but trying to define what sort of reports are required
before settling completely on the data elements might be smart.

Re the age categories issue I am not sure. It might work fine. Again
it might become a bit unwieldy when creating reports - you now need to
"remember" which age category options this data is collected for
in order to avoid "blank cells" in reports/pivots. Anyway we will
improve this model soon.
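
A minimal sketch of the "blank cells" concern, assuming a single shared age group category. The reported values and the dataelement name are made up for illustration, and the helper functions are hypothetical, not part of dhis.

```python
AGE_GROUPS = ["<5", ">=5", "5-18", ">=19"]

# Hypothetical reported values, keyed by (dataelement, age option).
values = {
    ("Example dataelement A", "<5"): 12,
    ("Example dataelement A", ">=5"): 30,
}


def pivot_row(dataelement, options):
    """One pivot row; options with no data come back as None (blank)."""
    return {opt: values.get((dataelement, opt)) for opt in options}


def collected_options(dataelement):
    """Age options this dataelement was actually collected against."""
    return [opt for opt in AGE_GROUPS if (dataelement, opt) in values]


full_row = pivot_row("Example dataelement A", AGE_GROUPS)
trimmed_row = pivot_row(
    "Example dataelement A", collected_options("Example dataelement A")
)
```

Pivoting against the whole category leaves blanks for "5-18" and ">=19". Restricting the row to the options actually collected removes them, which is the "remembering" step a report designer would otherwise have to do by hand.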

Hope things went well in Rwanda.

cheers

Lars

> Hi,
>
> I think your data element/category definitions make sense in this
> case. One should clearly not join the four category options you
> mention into the same category here.
>
> In Kenya we discovered that 90 % of the forms fit with the classic
> data-elements-on-the-rows-and-categories-on-the-columns pattern,
> meaning the auto-form function was not evil but rather quite handy.

Sure. Alternatively, not settling concretely on the form design before
designing the dataelements could work here.

> When designing data elements it's also important to make sure it
> becomes easy to produce analysis - ideally what you typically want to
> look for in reports should act as data elements. Your approach here
> might work fine but trying to define what sort of reports are required
> before settling completely on the data elements might be smart.

Sound advice. We should look at that.

> Re the age categories issue I am not sure. It might work fine. Again
> it might become a bit unwieldy when creating reports - you now need to
> "remember" which age category options this data is collected for
> in order to avoid "blank cells" in reports/pivots.

Yes the "remembering" is a pain. With a concept/category hierarchy
this will become easier, but for a short term workaround it won't be too
bad and it will be easier to convert existing data to an updated design
later. Mind you, the pivot tables should be much better, not worse,
in terms of blank cells.

> Anyway we will
> improve this model soon.
>
> Hope things went well in Rwanda

Thanks for the input
Bob

···

2011/10/3 Lars Helge Øverland <larshelge@gmail.com>:

cheers

Lars

> Yes the "remembering" is a pain. With a concept/category hierarchy
> this will become easier, but for a short term workaround it won't be too
> bad and it will be easier to convert existing data to an updated design
> later. Mind you, the pivot tables should be much better, not worse,
> in terms of blank cells.

True.