[OPENMRS-IMPLEMENTERS] x-forms and remote formentry module

Hi Ola,

there is also Google Gears technology, which is based on JavaScript and allows you to create an offline version of a web application that syncs automatically with the server version when an Internet connection is available. There are also some tools to convert Java to Gears JavaScript code…

Hope this can help,

Regards,

Romain

···


Dr Tohouri Romain-Rolland
www.tohouri.com

Hi,

Anything to learn/reuse from OMRS when it comes to offline data entry and synching with the server? (see email below).

"Semi-online" data entry is much needed in online user environments where the Internet connection is unstable and data entry is sometimes better done offline in bulk jobs, plus a synch with the server when the connection is available (and where the cost of installing a full offline DHIS as a backup solution is too high). This issue has come up in Kenya.

To me the easiest and most short-term solution seems to be Excel data entry offline (in restricted and standardised worksheets) and then import to the server using e.g. transforms to DXF. If there is something to reuse from OMRS, that might speed up the process of developing a more sophisticated solution.

What do you think?

Ola,


Ola Hodne Titlestad (Mr)

HISP

Department of Informatics

University of Oslo

Mobile: +47 48069736

Home address: Vetlandsvn. 95B, 0685 Oslo, Norway.

Actually Gears was deprecated in favour of HTML5 some time ago...

Jo

···


It is interesting to look at HTML5 for offline storage and synching; however, it still seems to be early days, with a need for standardization. This article spells out the current status (as of half a year ago) and mentions options like serializing JSON with PersistJS:
http://rethink.unspace.ca/2010/5/10/the-state-of-html5-local-data-storage

XSLTforms is another solution in this space:
http://www.agencexml.com/xsltforms

Knut

···


Exactly. Since HTML5 and the like seem far off at the moment, I wanted to ask whether we could save some time by reusing the OMRS approach. Saptarshi, are you familiar with it? Any point in trying to reuse it in DHIS 2?

Ola

···


On this topic, SitePen is creating all the XML goodies like XPath etc. for JSON:

Persevere and Pintura provide an offline, local-storage, server-synching, RESTful JSON database solution with JSONPath/JSONQuery support, avoiding the need to manually comb through stuff. JSON Schema gives an object model of your data.

http://www.persvr.org/Page/Persevere

Knut

···


Currently it is possible to feed any type of xml-based data into dhis via the import/export module. Currently configured are dxf import and sdmx-hd in 2.0.5, but others can be implemented on the fly. But there are still 3 areas to be addressed to make this process "better":

(i) authentication - currently cookies/html-form based, which is awkward. I am watching what Jo is doing with basic auth with interest. The basic auth approach is probably going to be the simplest workable solution here, and we will leverage what the mobile folk are doing here.

(ii) management of identifiers/metadata. You can't have reliable off-site 3rd party client software producing datavaluesets without a robust exchange of metadata identifiers and codelists. My feeling on this, which has grown over the past year and a half, is in fact to stabilize the integer identifiers within a country context. So within a domain of say Kenya or Tanzania or Sierra Leone, the dataelement "New malaria cases" should have a fixed identifier (say 42, the meaning of life). I think the main implication of this might be either to do away completely with auto-generated ids as primary keys, or to also persist an authoritative id - you shouldn't be able to create a new identifiable object (dataelement, indicator, category or what have you) without *assigning* it an id. And to do that you must assume some sort of authority over that metadata element - ie ownership. In an environment where you might want to create dataelements from lower down the hierarchy, you then either also have to have an "authority" identifier, or we need to partition the range - eg 0-1000 are WHO SDMX-HD identifiers, 1000-2000 are national identifiers and identifiers 10000 and higher are locally assigned (sketched in code after point iii below). There are a few ways this could work but we need to solve it and stabilize somehow.

(iii) related to the above, when data is imported it should be possible for it to indicate the namespace of its metadata codes - rather than obliging the import of metadata along with the data.
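
As a rough sketch, the partition in point (ii) could look like this; the boundaries and authority names are the hypothetical ones above, everything else is invented for illustration:

    // A minimal sketch of the range partitioning in point (ii). The
    // boundaries and authority names are hypothetical, not a settled scheme.
    function authorityFor(id: number): string {
      if (id >= 0 && id < 1000) return "WHO SDMX-HD";
      if (id >= 1000 && id < 2000) return "national";
      if (id >= 10000) return "local";
      return "unassigned"; // 2000-9999 left open in this sketch
    }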

With such stability, offline data entry is simply a matter of creating custom xml datavalueset-authoring clients which can post to an http(s) url. These can be html5, xforms, ooxml, odf, custom mobile, pyqt or whatever. The format can always be transformed. Throwing around different "technology" suggestions for offline clients is a fairly pointless exercise. We know there are many possibilities. But none of them work without stabilizing the metadata identifiers.
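
For instance, such a client could be as thin as this sketch; the endpoint and payload shape are invented placeholders, and only the basic auth of point (i) is assumed:

    // Sketch of a thin datavalueset-posting client. The endpoint URL and
    // XML payload shape are placeholders, not a real DHIS API.
    async function postDataValueSet(xml: string, user: string, pass: string): Promise<void> {
      const resp = await fetch("https://example.org/dhis/api/dataValueSets", {
        method: "POST",
        headers: {
          "Content-Type": "application/xml",
          // basic auth, per point (i): base64 of "user:pass"
          "Authorization": "Basic " + btoa(user + ":" + pass),
        },
        body: xml,
      });
      if (!resp.ok) throw new Error("upload failed: " + resp.status);
    }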

I suspect that should be our major design effort moving forward.

Bob

···



Agree with all of this. It's planned for and will be addressed in 2.0.7.


Yes. I think there is a slight difference between offline data entry (where data is entered into one of the containers you mention and sent manually) and what Ola means by "semi-online" data entry. In the latter the user will open the data entry screen as usual. Then we want to be protected from network downtime by letting the user continue to enter data into local storage; when the network is back up, the local data can be flushed and submitted to the online server. The web storage functionality in HTML5 provides a suitable lightweight key-value storage mechanism, and there is an API for determining online/offline status in this regard. We would in any case wait for this to mature before implementing anything based on it.

http://dev.w3.org/html5/webstorage/
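
A minimal sketch of that idea, assuming the web storage and online/offline events from the HTML5 drafts; the storage key and endpoint are invented for illustration:

    // Sketch: queue entered values in web storage, flush when online again.
    const QUEUE_KEY = "pendingDataValues"; // hypothetical storage key

    function saveLocally(value: unknown): void {
      const queue: unknown[] = JSON.parse(localStorage.getItem(QUEUE_KEY) ?? "[]");
      queue.push(value);
      localStorage.setItem(QUEUE_KEY, JSON.stringify(queue));
    }

    async function flushQueue(): Promise<void> {
      const queue: unknown[] = JSON.parse(localStorage.getItem(QUEUE_KEY) ?? "[]");
      for (const value of queue) {
        await fetch("/api/dataValues", { // placeholder endpoint
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify(value),
        });
      }
      localStorage.removeItem(QUEUE_KEY);
    }

    // the online/offline events tell us when the connection is back
    window.addEventListener("online", () => { void flushQueue(); });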

···


OK. True. I was still thinking in terms of the xforms scenario at the top of the thread.

···


Hi,

seeing if we can keep this thread alive..

On (i), authentication: I don't think authentication in itself is an issue for this, it is more a general issue (and a big issue, at that :). The way dhis security is today, there is no reason not to just use basic..

On (ii), management of identifiers/metadata: if I understand you correctly, I agree with your technical points; we should assign an id to the entities (and an "authority" should be part of the global id?).

But we have to be careful that we don't end up with something that only works where such assignment is done "right" and changes to the entities are more or less not allowed (or very controlled). To handle that, there is one thing I think is missing from the list: versioning. I don't immediately see anything in 2.0.7 indicating that we are thinking about versioning (maybe you are hoping to solve it by just requiring "stable" metadata?), so I thought I'd bring it up.

(iv) Versioning of metadata

Any kind of client that can be offline (be it a semi-online web client or any other system) means that changes might happen that need to be captured in some kind of consistent way in dhis, and that version conflicts need some way to be resolved. To be able to communicate changes externally and deal with the differences between the distributed models when communicating, I think we more or less have three ways to look:

1. Build versioning into the core domain
2. Build a separate api/model for import-export/whatever we call it that handles versioning in some fashion (though I don't really know what that would look like)
3. "Declarative" versioning/manual version conflict resolution for apis talking to external systems (sort of like the current sms handling, plus some way to declare changes).

Versioning potentially adds a lot of complexity, so I am a bit wary of bringing it up. But I don't think handling versioning individually for each external api is the way to go, and I am pretty certain that trying to pipe everything through a high-level impexp mechanism is not going to work either, as external interactions get more granular. I'm not sure if light client interaction (think handling many concurrent small interactions) and big impexp kind of stuff can easily share solutions, but we need to think through some of the possibilities.

Jo

···


To me versioning is not different to stable. ie if a metadata item
has an id of 42, an owner of "Kenya National MOH" and a version of
2.4, then I (ideally) expect that id to be constant in perpetuity for
that version.
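
A sketch of that expectation, with invented names: treat (owner, version, id) as an immutable composite key.

    // Sketch: an id is only meaningful relative to owner and version, and
    // within one (owner, version) pair it must never be re-bound.
    interface MetadataRef {
      owner: string;   // eg "Kenya National MOH"
      version: string; // eg "2.4"
      id: number;      // eg 42
    }

    const bound = new Map<string, string>(); // composite key -> element name

    function bind(ref: MetadataRef, name: string): void {
      const key = ref.owner + ":" + ref.version + ":" + ref.id;
      const existing = bound.get(key);
      if (existing !== undefined && existing !== name) {
        throw new Error("id " + ref.id + " already bound to '" + existing + "' in this version");
      }
      bound.set(key, name);
    }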

···


I agree (of course), but I'm not sure I understand the implications.

As far as I can tell versioning wasn't part of your discussion on identifiers; do you think you want it there? If versioning is going to be part of the id regime, like in the above example, it would also have to be part of the discussion somehow.. If it's not going to be there, I think it would be good to at least have some discussions on what to do for the most immediate use cases that need some way to coordinate metadata changes. A third option would of course be that it is too ambitious for us to try to coordinate on versioning issues for the time being, but then I think it would be good if we could manage to at least explicitly agree on that.

The main reason for my interest is that on the mobile side, we're going to have to come up with some solution for versioning. Just as I guess you already have had to make some sort of versioning for import-export use cases. And, as Ola's original mail pointed to, there is a potential use case for better support for syncing dhis instances.

Should we just think of versioning as a use-case-specific issue for the time being, or would we be able to take some small steps to at least avoid unnecessary fragmentation? I guess both fragmentation and coordination have their built-in challenges; I gather that the silence on the topic maybe should be taken as a signal that we should go with the first option for the time being? :)

Jo

···


It wasn't a complete discussion so much as a response. But yes, versioning has to be there in one form or another. The fact is that a client will make use of a set of identifiers. And they will change from time to time. I don't want to argue that they can never change. Of course they will. But if and when they do, we will be talking about different versions. The question is whether there is sufficient use case to acknowledge that explicitly - to take the bull by the horns as it were - or whether it is sufficient to manage the process of getting everything back in synch in a manual, flexible, ad-hoc manner.

I think the problem with the latter approach is that it only appears least complex on the surface.

But beneath the surface users will be (and are) managing versions anyway, without any system support for doing so. That is easier or harder depending on (i) how closed the system is, (ii) how big the system is, and (iii) how far forward in time metadata governance proceeds. And (hopefully) increasingly on the amount of 3rd party collaboration/interoperability going on. I suspect we find it convenient to go into new environments and set up pilot projects and the like if we can temporarily put aside complex concerns like metadata versioning. A sort of suspension of disbelief which allows things to get done. But over time these systems must get harder to maintain. Probably some battle stories from the field would shed light on this - particularly on dhis deployments which have survived over some time (India might be interesting here) as well as deployments which have fallen apart or collapsed into disuse.

I suspect the sweet spot is to have a rigorously correct system capacity to maintain complex metadatasets over time (with all the implications of potentially complex metadata governance user roles), as well as the ability to ignore or override it, particularly in situations of rapid and frequent change like you would likely find in the early life of projects.

On the sdmx side we sidestepped the issues slightly by using an external dxf metadata dump as a canonical reference to be shared with 3rd party clients (transformed to sdmx of course, but that is incidental). This temporarily solves the issue of the lack of stability of integer ids in the database, but doesn't resolve the version problem. A solution I have toyed with is to stamp a version number on the entire dxf metadata dump. That gives at least coarse-grained versioning. But it is also draconian in its way - for example you cannot change a single orgunit or dataelement or category in the entire set without triggering an overall version update.

One way to think about it might be to consider different kinds of change differently. eg some additive changes (a new dataelement) and changes in textual identifiers (a spelling correction of a facility name) need not completely invalidate existing metadata that a client might have. The kinds of change which are most cataclysmic are the ones where, for example, existing facilities acquire new integer ids. And this is in fact the most common kind of change one might see, given the way we assign those ids currently - effectively arbitrarily, using whatever the particular db sequence generators give us. It might well be that as we move to stabilize those integer identifiers we actually end up reducing these kinds of breaking changes significantly, and simplify the versioning requirements in the process. Not sure .. I'm really trying not to think too much about this until January.

I think what you are doing (and it is not too dissimilar to what I have done) is to temporarily suspend disbelief and work with the integer identifiers in the hope (and, in some circumstances, reasonable expectation) that they won't get shuffled on you :) At least not without warning. I think that will work for a short while and will drive the effort to stabilize those ids in 2.0.7. And I think we will need to build some concept of versioning into this, but hopefully not as draconian or difficult to work with as you are fearing.

Regards
Bob

···


Just a quick battlefield story from what we experience in Sierra Leone, where we are facing insufficient internet connectivity and must resort to offline deployments. The districts are quite rural and we experience an "offset" when distributing new databases (metadatasets), which means that for each update there will be districts that still export data to the national server using the previous metadataset for some time, until the new set has percolated completely.

Like Bob says, we deal with different types of changes; the ones I can think of are

  1. Adding of elements (dataelements, orgunits) (in the national database)

  2. Removal of elements (in the national database)

  3. Updates of element names/properties (in the national database)

The point of departure is that the current solution, where an export message contains both data and metadata and we match on the display name of elements, is not appropriate, and that we will move to using dedicated, agreed and stable metadata identifiers of some sort. As Jo says, we are not planning for versioning in 2.0.7.

For the current solution, the implications of the mentioned changes when importing data from the out-of-date districts into the national database are, for the change types respectively, as follows:

  1. No data will come in. No problem.

  2. Old metadata elements will continue to come in. Minor problem.

  3. Metadata elements and their data which represent the same element will come in, but under a different name. Big problem and a source of complete chaos.

With the new proposed solution with stable metadata identifiers, the situation when receiving data from the out-of-date districts is as follows (see the sketch after this list):

  1. No data will come in. No problem. As before.

  2. Data will be ignored as the metadata identifier will not match anything in the national database. No problem.

  3. Data will match on the stable identifier even if the name/properties are changed. No problem.
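
A sketch of the matching rule behind these three cases; the types and names are invented, and it assumes incoming values carry the stable identifier:

    // Sketch of import matching by stable identifier rather than display name.
    interface IncomingValue { elementId: number; value: string; }

    // the national database keyed by the stable id, not the display name
    const national = new Map<number, { name: string }>();

    function importValues(incoming: IncomingValue[]): void {
      for (const v of incoming) {
        const element = national.get(v.elementId);
        if (element === undefined) continue; // case 2: removed element, value ignored
        // case 3: a renamed element still matches on its stable id
        console.log("storing " + v.value + " for " + element.name);
      }
      // case 1: values for newly added elements simply never arrive
    }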

So in the offline deployment of the "regular" DHIS scenario this move will be a huge improvement. It might not be perfect, and there might be situations I have not thought about, but it will help to alleviate the bigger problems.

I am not sure how relevant this is in Jo’s mobile scenario.

Lars

PS. That said, I am not sure how appropriate the versioning paradigm is in this scenario. What if you want to change a name or add an element in the national database; should we then deny all out-of-date districts the ability to report any data at all? Can we live well with what is described above?


Very much agree. But if we are to have stable identifiers then they will have to be distributed somehow as a metadata set. At the very least that might have a timestamp, eg:

    <dxf>
      <metadata published="2011-01-01" publisher="SL MOHS">
        <!-- de's, mcdonalds, orgunits etc. -->
      </metadata>
    </dxf>

And Jo, this could be a fine-grained transaction (even JSON encoded!) rather than a full metadata set, eg metadata for a particular dataset or orgunit. Either way it's not harmful to have a version attribute or publication date, nor does it necessarily have to be too complicated.

On reading data into, say, the national db, it might look like:

    <dxf>
      <data metadataversion="2010-11-01" metadatapublisher="SL MOHS">
        <!-- datavalues ... -->
      </data>
    </dxf>

Note that the metadata is outdated. The simplest case on the national side is that it simply ignores the metadata attributes. That would essentially be the improved situation Lars describes above, so nothing is lost. A slightly more complex response would be to trigger a metadata preview which shows a reasonably useful diff of the two versions, which might indicate whether anything "bad" can happen or whether any data will be lost. Following that the user can decide how to continue: to import or not, or to perform some kind of translation. These different kinds of responses could be controlled with a system setting. At the very least we will have useful information that the sending system still needs to update its metadata.
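
That choice of responses could hang off a simple system setting; a sketch, with the setting values invented:

    // Sketch: reacting to an outdated metadataversion on import, driven by
    // a system setting. Setting values and outcomes are invented.
    type ImportPolicy = "ignore" | "preview" | "reject";

    function onIncomingData(incomingVersion: string, currentVersion: string,
                            policy: ImportPolicy): string {
      if (incomingVersion === currentVersion) return "import";
      switch (policy) {
        case "ignore":  return "import";             // attributes simply ignored
        case "preview": return "show metadata diff"; // let the user decide
        case "reject":  return "refuse and notify";  // sender must update metadata
      }
    }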

Implementing this could be fairly simple. I'm thinking, for example, that every time the national MOH "publishes" (ie agrees on a set of metadata to be sent out to districts, provinces and all) it retains a copy of that published metadata (export-2010-01-01.xml). That way the system has some memory of what is out there in the darkness. That's the most basic level of versioning I can foresee. Though it could start to become wild if metadata is published every other day. This, I imagine, is most typical of startup scenarios and needs to be catered for.

Got to go. Catch up on this later.

Cheers
Bob

···


Not very similar to the mobile scenario, but it is nevertheless relevant for getting an idea of the scope we are targeting.

For mobile (and probably also other integration and "light client" use cases) we generally need versioning for reasons like:

1) the other side is not working on the full dhis "metadata set" (so versioning by "database upgrade" is not an option),
2) limited bandwidth (and storage, given the nature of the clients) means that we want to limit what we have to send to the client to a minimum, and
3) the sheer number of "uncontrolled" clients means that we cannot really rely on controlled, manual client upgrading (if we can avoid it).

As for your PS: that's one of the reasons I would like to have a bit more high-level discussion about versioning before going further. But I don't really think that example is a good way of framing it. There is really nothing that says that fixing a typo in a name has to represent a new version, or that versions cannot be marked in different ways (such as "extending" or "cosmetic"). There is also nothing that says you have to throw the data away because you have versioning; versioning can actually help you do the opposite :) So to me it is more a case of, as Bob said, *how* we want versioning to be done rather than if.. And I'm pretty sure we'll find more and more use cases where versioning globally will not be enough, as we go along..

I'm adding an example of how mobile versioning might be implemented in a "manual" fashion (it has not really been worked through, though). For mobile we have "cosmetic" fixes that are important to notify to the client (ordering, section changes, names) but would not mean incompatible changes to the data values being edited or sent. It might be that these differences between compatible and incompatible changes are more general? (This will probably only be an option where decent gprs is relatively available and stable, btw.)

Server side configuration:

- We make a server-side management page that lists all data sets with an option to "publish" them for mobile reporting
- published datasets can be manually tagged as "extended", "incompatible change" or "removed"
- when saving this configuration, we timestamp (version) all changes made, and also save a timestamp/version for the mobile set as a whole.

(Extended might be tricky, depending on how complex it is to change how the client works; it might only be a "texts changed" flag for now. Ideally it would mean that the client is capable of reorganizing stored dataValueSets on the phone to allow for changes to the order of elements and sections, and also for elements added to the form that are not present in the stored datavaluesets.)

Client side updating:

- When downloading forms, we send the forms published in this console, adding the timestamps to the datasets downloaded (and also the timestamp for the whole set)
- When starting the app on the mobile, we try to make a request to the server for changes (if it fails, ignore it)
- if something has changed, we download the changes
- if the change is extended or removed, no problem
- if the change is incompatible, remove the dataset and any locally saved data (warning the user in that case)

Server side "report" receiving:

- data received from the client will be checked against the current version
- if it's the same version, or there is no incompatible change between the versions -> store
- if there is an incompatible change, store in a queue for admin evaluation (much like is implemented for sms now)

....
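
A rough sketch of the client-side check described above; the manifest shape and endpoint are invented:

    // Sketch of the client-side update check. Names are illustrative only.
    interface DataSetChange {
      id: number;
      timestamp: string;
      change: "extended" | "incompatible" | "removed";
    }

    async function checkForUpdates(localStamp: string): Promise<void> {
      let changes: DataSetChange[];
      try {
        const resp = await fetch("/mobile/datasets/changes?since=" + localStamp);
        changes = await resp.json();
      } catch {
        return; // offline or server unreachable: ignore, try again on next start
      }
      for (const c of changes) {
        if (c.change === "incompatible") {
          // warn the user, then drop the dataset and any locally saved data
          localStorage.removeItem("dataset:" + c.id);
        }
        // extended or removed changes can be applied without losing data
      }
    }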

Jo

···


I assume that if the dataset is incompatible, it contains elements which are not mappable to current elements (not collected anymore? a change in breakdown?). Just out of curiosity: do we have specific use cases where we are interested in collecting data which is not mappable to the current metadata?

···


I don’t know that much about the implementation side of things, so others would have to answer that one.

But say you want to change, from one month to the next, which dataelements are required (we don't have that functionality for datasets [yet], but we do in the community module). Or you want to change the type of a dataelement, or its macdonalds stuff? Or you just want changes to be consistent across the solution.

Since this is not about whole instances having to be manually updated, automating the workflows here would seem possible and very much desirable. The main problem is handling edge cases where the user has edited a data set before it has gotten the update from the server, and for those I would think we could build in configurable support as we go along.

In the general case it seems to me that, at least in India, change is needed from time to time (especially when new functionality is being deployed), and I think it is very much an implementation issue how you need to deal with it. I guess you can always manage without incompatible changes in principle, but I'm not sure it is wise to have that as a requirement?

Also note that for mobile (for now, at least), orgunit changes would be a manual issue to handle, as the app is (for now, at least) tied to the orgunit the user is reporting for. Changing orgunit is more complex, because which datasets (and other things) are to be reported can differ from orgunit to orgunit, and a more general solution would require a more complex ui and more memory and bandwidth…

Jo

···


OK. Having a kind of light-weight version management, where you have the possibility of telling the client (whether a mobile phone or a dhis offline instance) that its metadata is outdated and issuing a warning, sounds very sensible. And that might be achieved without adding too much complexity.

I think this illustrates the differences between the mobile and "standard" dhis offline deployments. In the latter the metadata changes are mostly i) done yearly and ii) incompatible, meaning that the database must be upgraded by an expert and not by an automated routine; also, flushing all data is not an option. And there is no easy way of conveying to the client that the metadata is outdated, since it is not connected to the network.

Lars
