[Bug 1597724] [NEW] Import of GML file update lastupdated date of all objects with co-ordinates

Public bug reported:

Importing a GML file sets the "lastupdated" date of every
organisationunit that has GIS co-ordinates to the date of the import.

I am not sure whether this is by design, but it creates a problem for
facility registries: after every GML import, all organisationunits with
co-ordinates appear to have had some field updated, when in fact only
some of the objects changed.

Ideally the import should treat only new or genuinely changed
co-ordinates as updates, and write just those, instead of overwriting
all co-ordinates and marking them all as updated.

** Affects: dhis2
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of DHIS 2
developers, which is subscribed to DHIS.
https://bugs.launchpad.net/bugs/1597724

Title:
  Import of GML file update lastupdated date of all objects with co-
  ordinates

Status in DHIS:
  New

Bug description:
  Importing of a GML file updates the "lastupdated" date of all
  organisationunit with GIS co-ordinates to the date of the import.

  Not sure if this is by design but it creates a problem for facility
  registries in that it appears that all organisationunits with co-
  ordinates had some field updated every time a GML file is imported
  when in fact only some of the objects had updates.

  Ideally the import should only update new/real updates and for those
  update the GIS co-ordinates instead of overwriting all co-ordinates
  and seeing them as updates.

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhis2/+bug/1597724/+subscriptions

This is most likely an (unintended) consequence of the somewhat
contrived method we're using to import coordinates from GML files.
Unfortunately it's most likely not an easy fix, but I'll look into it.

Does this issue have any immediate consequences for your
implementation(s)?

Thanks for reporting!

** Changed in: dhis2
       Status: New => Confirmed

** Changed in: dhis2
     Assignee: (unassigned) => Halvdan Hoem Grelland (halvdanhg)

I've investigated and retract my earlier confirmation that this is a
bug. I misunderstood at first, but reading your report again makes it
clear that you are actually supplying a GML file with coordinates for all
orgunits, including those which already have coordinates.

As you are, in fact, importing coordinates for all of these orgunits,
it's technically not wrong that the lastUpdated is also set to reflect
this. We don't really discern between old vs. new value of metadata when
updating on import.

If you wish to avoid this you can give the importer a GML file which
only contains the orgunits you actually wish to update.

Setting this one to won't fix.

** Changed in: dhis2
       Status: Confirmed => Invalid

Elmarie,

It is a weakness of ANY import process in DHIS2 that there is no
differentiation between actual updates - i.e. where an imported value is
different from the existing one - and imported records that match
existing records with identical values. Every imported record that
matches an existing record is regarded as an "update", and the only
distinction made is against valid records with no match ("new" records),
where valid means having a valid primary key etc.

DHIS 1.4 provides far more fine-grained control over any import process,
because imported data records are automatically identified as one of:
- New data records
- Existing data records with a different value and where the lastupdated is
newer (i.e. the source data value has been updated more recently than the
destination database value)
- Existing data records with a different value and where the lastupdated is
older (i.e. the destination database value has been updated more recently
than the source database value)
- Existing data records with an identical value (the lastupdated value is
in that case not relevant).

During the import process, the user can review these different categories
of imported data and decide whether to accept or reject them. That is not
possible with the DHIS2 import model - everything in the imported file that
fits existing primary keys will be imported/updated automatically.
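
To make the four categories above concrete, here is a minimal Python
sketch. It is not DHIS 1.4 or DHIS2 code: the names Record, ImportCategory
and classify are invented for illustration, and treating equal timestamps
as "source older" is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class ImportCategory(Enum):
    NEW = "new"
    CHANGED_SOURCE_NEWER = "changed, source newer"
    CHANGED_SOURCE_OLDER = "changed, source older"
    IDENTICAL = "identical"


@dataclass(frozen=True)
class Record:
    key: str               # primary key used to match records (hypothetical)
    value: str             # the imported or stored value
    lastupdated: datetime  # when that value was last changed


def classify(incoming: Record, existing: Optional[Record]) -> ImportCategory:
    """Place one imported record into one of the four categories."""
    if existing is None:
        return ImportCategory.NEW
    if incoming.value == existing.value:
        # Identical values: the timestamps are not relevant.
        return ImportCategory.IDENTICAL
    if incoming.lastupdated > existing.lastupdated:
        return ImportCategory.CHANGED_SOURCE_NEWER
    # Assumption: a tie on lastupdated is treated as "source older".
    return ImportCategory.CHANGED_SOURCE_OLDER
```

A reviewing step could then group the incoming records by category and let
the user accept or reject each group before anything is written.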

I'm not 100% sure why the core team opted for this very "basic" import
methodology, but I suspect it's largely related to the assumption that most
DHIS2 instances will be national single instances without much need for
import or export of data. I suspect that changing the import methodology to
give users the same high-granularity control as in 1.4 would require a
major effort, but changing only how the lastupdated field is treated should
presumably be a lot simpler.
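
As a rough illustration of what a different treatment of lastupdated could
look like, the Python sketch below only touches the timestamp when the
value really changes. It is not how the DHIS2 importer works; the dict
layout, the field names and the apply_coordinates function are assumptions
made up for this example.

```python
from datetime import datetime, timezone
from typing import Optional


def apply_coordinates(orgunit: dict, new_coordinates: str,
                      now: Optional[datetime] = None) -> bool:
    """Store new_coordinates on orgunit; return True only if they changed."""
    if orgunit.get("coordinates") == new_coordinates:
        # Identical value: leave both the value and lastupdated untouched.
        return False
    orgunit["coordinates"] = new_coordinates
    orgunit["lastupdated"] = now or datetime.now(timezone.utc)
    return True
```

With this rule, re-importing a file where most coordinates are unchanged
would leave the lastupdated of those orgunits as it was.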

Halvdan's suggestion that "if you wish to avoid this you can give the
importer a GML file which only contains the orgunits you actually wish to
update" is not a very practical option, regrettably. If, as in our typical
case, you have somewhere between 5,000 and 40,000 coordinates with
additions and corrections taking place regularly through various external
spatial databases, your primary option will be to access the database
instance directly and run queries that will identify any new or updated
co-ordinates by comparing the current external data set with whatever is
currently stored in the instance. After identifying the sub-set of new or
updated co-ordinates that way, you can then generate a new GML file OR you
might rather use the same queries to update the organisationunit table
directly.
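
The comparison described above can be sketched in Python instead of SQL.
This assumes both data sets have already been loaded into dictionaries
keyed by some shared orgunit identifier; the changed_subset function, the
(longitude, latitude) tuple layout and the tolerance value are all
assumptions for illustration.

```python
from typing import Dict, Tuple

Coordinate = Tuple[float, float]  # (longitude, latitude)


def changed_subset(external: Dict[str, Coordinate],
                   stored: Dict[str, Coordinate],
                   tolerance: float = 1e-7) -> Dict[str, Coordinate]:
    """Return the external coordinates that are new or differ from stored."""
    changed = {}
    for key, coord in external.items():
        current = stored.get(key)
        if current is None:
            changed[key] = coord  # no coordinates stored yet: new
        elif any(abs(a - b) > tolerance for a, b in zip(coord, current)):
            changed[key] = coord  # stored coordinates differ: updated
    return changed
```

The tolerance guards against treating floating-point round-off as a real
change; whether that is appropriate depends on how the coordinates are
stored.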

(Another, more theoretical option would be for somebody to keep track of
all such changes in those external spatial databases and GIS systems, but
since spatial data development processes typically involve a multitude of
public and private organisations, it is again not practical to keep track
of all the updates and additions they generate.)

For now, I guess the most efficient method will be to drop GML file
imports and instead update our PostgreSQL database instances directly with
ACTUAL updates. It's not a very attractive option because it reduces
security and increases the possibility of somebody making crippling updates
to the database. Alternatively, we can use database queries to identify new
and actually updated coordinates, and then generate a GML file with only
those records. The latter method is more cumbersome, but it will work with
Production instances where we have disabled direct database access for
security reasons.
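
Generating a GML file with only the changed records could be sketched as
follows, using Python's xml.etree.ElementTree. The gml namespace URI and
the featureMember, Point and coordinates elements are standard GML; the
"ex" namespace and the FeatureCollection, orgUnit, Name and
geometryProperty element names are hypothetical placeholders. The layout
the DHIS2 importer actually expects should be taken from the DHIS2
documentation, not from this sketch.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

GML_NS = "http://www.opengis.net/gml"
APP_NS = "http://example.org/orgunits"  # hypothetical application namespace

ET.register_namespace("gml", GML_NS)
ET.register_namespace("ex", APP_NS)


def build_gml(points: Dict[str, Tuple[float, float]]) -> bytes:
    """Serialise {orgunit name: (longitude, latitude)} as a small GML file."""
    root = ET.Element(f"{{{APP_NS}}}FeatureCollection")
    for name, (lon, lat) in sorted(points.items()):
        member = ET.SubElement(root, f"{{{GML_NS}}}featureMember")
        feature = ET.SubElement(member, f"{{{APP_NS}}}orgUnit")
        ET.SubElement(feature, f"{{{APP_NS}}}Name").text = name
        geometry = ET.SubElement(feature, f"{{{APP_NS}}}geometryProperty")
        point = ET.SubElement(geometry, f"{{{GML_NS}}}Point")
        ET.SubElement(point, f"{{{GML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

Only the orgunits passed in end up in the file, so only those would be
treated as updates by an importer that updates everything it is given.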

I'll make sure we raise this issue again during our meetings in Oslo in
August. Halvdan might find himself voted down on this one, we'll see ;-)

Regards
Calle

On 30 June 2016 at 17:05, Halvdan Hoem Grelland <halvdan@dhis2.org> wrote:

I've investigated and retract my earlier confirmation that this is a
bug. I misunderstood at first, but reading your report again makes it
clear that you are actually supplying a GML file with coordinates for all
orgunits, including those which already have coordinates.

As you are, in fact, importing coordinates for all of these orgunits,
it's technically not wrong that the lastUpdated is also set to reflect
this. We don't really discern between old vs. new value of metadata when
updating on import.

If you wish to avoid this you can give the importer a GML file which
only contains the orgunits you actually wish to update.

Setting this one to won't fix.

** Changed in: dhis2
       Status: Confirmed => Invalid

_______________________________________________
Mailing list: https://launchpad.net/~dhis2-devs
Post to     : dhis2-devs@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-devs
More help   : https://help.launchpad.net/ListHelp

--

*******************************************

Calle Hedberg

46D Alma Road, 7700 Rosebank, SOUTH AFRICA

Tel/fax (home): +27-21-685-6472

Cell: +27-82-853-5352

Iridium SatPhone: +8816-315-19119

Email: calle.hedberg@gmail.com

Skype: calle_hedberg

*******************************************

Thanks Halvdan and Calle.

Elmarie
