Since this discussion moved from roadmap to import-export and then interchange formats… let me add something to chew on for import-export.
Why aren’t we using a diff-based import-export? Suppose I already have an exported file and the new export file should contain only the new values; we could then generate only the changed values in the export process. That would definitely reduce time and file size in places where import-export is used monthly or weekly.
Has anyone thought of object serialization? We could serialize objects and move them to new places… A radical and stupid way, but it may be useful for GIS import-export, where the dxf format may fall short on representing shareable geographic information. Disease surveillance on the CBHS may be one candidate…
Regards,
Saptarshi PURKAYASTHA
Director R & D, HISP India
Health Information Systems Programme
> Why aren’t we using a diff-based import-export?? Suppose I already have an exported file and the new export file only should contain the new values, we can generate only the changed values in the export process. Definitely reduces time/ file size for places where import-export will be used monthly/weekly.
We could definitely do this by using the timeStamp property on DataValue and let the user specify a “since” date in the export GUI. But the user is already specifying which months to export data for, typically only the last month so I am not sure how much we would save on the file size…
I was thinking of cases where only a few data values were changed in a given period. Say we changed only a few values in last year’s data, and the last month’s data was not exported earlier. I would then need to export 13 months of data, whereas we could just export the diff: the changes to last year’s values plus this month’s data.
> Since this discussion moved from roadmap to import-export and then
> interchange formats... lemme add something to chew for on import-export.
> Why aren't we using a diff-based import-export?? Suppose I already have an
> exported file and the new export file only should contain the new values, we
> can generate only the changed values in the export process. Definitely
> reduces time/ file size for places where import-export will be used
> monthly/weekly.
Thinking out loud: the problem with a diff is that you need to have
something to diff against. So the program would have to be able to
read the exported file you already have in order to un-select these
values from the set in order to export. On the other hand it would be
quite trivial (and maybe in the end quicker) to have a dxf-diff
process which can generate the difference between two exported files.
Still, it's an interesting thought ...
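A minimal sketch of the dxf-diff idea, assuming each export has already been parsed into a dictionary keyed by a hypothetical (data element, period, org unit) triple. The key shape and the sample codes are illustrative assumptions, not the real dxf structure.

```python
from typing import Dict, Tuple

# Hypothetical key: (data element, period, org unit) identifies one data value.
Key = Tuple[str, str, str]


def diff_exports(old: Dict[Key, str], new: Dict[Key, str]) -> Dict[Key, str]:
    """Return the entries of `new` that are missing from `old` or carry a different value."""
    return {key: value for key, value in new.items() if old.get(key) != value}


previous = {
    ("ANC1", "2009-02", "ClinicA"): "12",
    ("ANC1", "2009-03", "ClinicA"): "15",
}
current = {
    ("ANC1", "2009-02", "ClinicA"): "14",  # corrected after the first export
    ("ANC1", "2009-03", "ClinicA"): "15",  # unchanged, so left out of the diff
    ("ANC1", "2009-04", "ClinicA"): "9",   # new month
}

changed = diff_exports(previous, current)
print(changed)
```

Note that this only reports additions and modifications. Values deleted since the previous export would need separate handling, and both files still have to be read in full, which is the overhead in question.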
> Has anyone thought of object serialization?? Where we could serialize
> objects and move it to new places... A radical and stupid way, but may be
> useful for GIS import-export, where the dxf format may fall short on
> representing sharable geographic information. Disease surveillance on the
> CBHS may be one candidate...
Can of worms here - the respective merits of native serializing
(presumably) vs serializing to XML. Personally I would always go for
the latter.
I am not up to speed on exactly what shareable geographic information
we might want to share, but it makes sense to use an open geographical
markup language to do it (like GML) rather than trying to reinvent the
wheel in dxf. Spurious example:
That's not a good example. Real life might be more interesting ... but
there is no good reason why we couldn't quite easily get geographic
information into dxf. The beauty of namespaces.
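To illustrate the namespace point, here is a hypothetical sketch of a dxf-style element carrying a GML point. The dxf namespace URI and the `organisationUnit` element are made up for the illustration; only the GML namespace URI is the one published by the OGC.

```python
import xml.etree.ElementTree as ET

DXF_NS = "http://example.org/dxf"      # hypothetical; not the real dxf namespace
GML_NS = "http://www.opengis.net/gml"  # GML namespace published by the OGC

ET.register_namespace("dxf", DXF_NS)
ET.register_namespace("gml", GML_NS)

# A dxf element (hypothetical name) that embeds geometry from the GML vocabulary.
org_unit = ET.Element(f"{{{DXF_NS}}}organisationUnit", {"name": "ClinicA"})
point = ET.SubElement(org_unit, f"{{{GML_NS}}}Point")
pos = ET.SubElement(point, f"{{{GML_NS}}}pos")
pos.text = "-25.75 28.19"

xml_text = ET.tostring(org_unit, encoding="unicode")
print(xml_text)

# A GML-aware consumer locates the geometry by namespace; others can skip it.
found = org_unit.find(f"{{{GML_NS}}}Point/{{{GML_NS}}}pos")
```

A reader that only understands the dxf vocabulary can ignore elements from the other namespace, which is what makes this kind of mixing cheap.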
2009/4/23 Lars Helge Øverland <larshelge@gmail.com>:
>
>>
>> 6 complicates the current import strategy where objects get imported
>> "on
>> the fly" without temporary storage, maybe we can put this on hold.
>
> Actually this holds for 5 as well...
>
I'll have to look at this more closely - though I know you will have a
better idea of what is going on. In principle it shouldn't really
matter. All that's happening in 5 is that we import a <Period> object
(on the fly, in its entirety or what have you). Then we just use the
inherited periodID attribute as we import all the contained
dataValues. I am not seeing how this would be fundamentally
different. Could be I'm just tired and in need of a beer ...
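A rough sketch of what importing "on the fly" with an inherited period could look like, assuming a hypothetical layout in which `dataValue` elements are nested inside their `Period` element. The element and attribute names other than `Period` and `periodID` are assumptions for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical layout: dataValue elements nested inside their Period.
SAMPLE = b"""<dataValues>
  <Period periodID="200903">
    <dataValue dataElement="ANC1" source="ClinicA" value="15"/>
    <dataValue dataElement="ANC2" source="ClinicA" value="7"/>
  </Period>
  <Period periodID="200904">
    <dataValue dataElement="ANC1" source="ClinicA" value="9"/>
  </Period>
</dataValues>"""


def import_stream(xml_source):
    """Yield (periodID, dataElement, source, value) without building the whole tree."""
    current_period = None
    for event, elem in ET.iterparse(xml_source, events=("start", "end")):
        if event == "start" and elem.tag == "Period":
            # Remember the enclosing period; contained values inherit it.
            current_period = elem.get("periodID")
        elif event == "end" and elem.tag == "dataValue":
            yield (current_period, elem.get("dataElement"),
                   elem.get("source"), elem.get("value"))
            elem.clear()  # release the element once it has been handled
        elif event == "end" and elem.tag == "Period":
            current_period = None
            elem.clear()


rows = list(import_stream(io.BytesIO(SAMPLE)))
print(rows)
```

The only state carried between elements is the current period identifier, so no temporary storage of the whole file is needed in this simplified case.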
“A data mesh allows information to be synchronized in a peer-to-peer way, allowing offline work, and synchronizing with whoever is available, not just a central database or a service on the internet. This makes it a perfect fit for situation where there is little/no connectivity or where the synchronization has to happen between different applications and services.”
I think that by saying “sure” Lars means it is possible using the dataValue timeStamp field. If you modify some of last year’s data this month (I don’t know why you would, or whether this is allowed after the data have been reported), the timeStamp will be the date the change occurred. So we need functionality that allows selecting a date range, from one date to another.
mesh4x, proposed by Knut, seems quite a valuable tool; I need time to look at it.
regards,
murod
Export of an old batch of data where some data have been changed is a very likely and “normal” event - also after the data has been reported/exported the first time. For example, errors have been detected, quality procedures have been put in place, many or few items may have changed - the file is exported to (typically) a higher level again.
Re-export of the same, but slightly changed, data file is part of normal data quality procedures.
And (as Lars suggested) by using the lastUpdated flag on data values and an “include values changed since ” field in the export GUI we can create an export file containing all of last month’s data plus e.g. all values changed in the last year. This is how it’s done in 1.4.
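The selection rule described here can be sketched as follows. The `DataValue` class is a simplified, hypothetical stand-in and not the DHIS 2 class; `select_for_export` and its parameters are likewise assumed names.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Set


@dataclass(frozen=True)
class DataValue:
    # Hypothetical, simplified stand-in for a data value record.
    data_element: str
    period: str        # e.g. "2009-03"
    value: str
    last_updated: date


def select_for_export(values: Iterable[DataValue], periods: Set[str],
                      changed_since: date) -> List[DataValue]:
    """Keep values in the chosen periods, plus any value modified on or after `changed_since`."""
    return [v for v in values
            if v.period in periods or v.last_updated >= changed_since]


values = [
    DataValue("ANC1", "2008-06", "20", date(2008, 7, 5)),   # old and untouched
    DataValue("ANC1", "2008-09", "31", date(2009, 4, 10)),  # old, corrected recently
    DataValue("ANC1", "2009-03", "15", date(2009, 3, 31)),  # last month's data
]

exported = select_for_export(values, {"2009-03"}, date(2009, 4, 1))
print([v.period for v in exported])
```

With this rule the 13-month scenario produces one month of data plus the handful of corrected values, rather than 13 full months.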
> I was suggesting times when only few datavalues were changed in a given period. Say for the last year’s data, we changed only a few values and the last month’s data was not exported earlier. I would then need to export 13 month’s of data, whereas we can just export those diff datavalues (changes from last year and this month’s data)
Sure.
A timestamp diff instead of a file-based diff. Getting a new export out
and then doing a file diff with the last export is a huge overhead.
As for GIS import-export, I suggested we add something to the
blueprint, either in dxf or Java object serialization. Both have some
merits/demerits wrt our problem. Oracle had that GIS type in the db for
something. Those arguments are valid in our context.
> A timestamp diff instead of file-based diff. Getting a new export out
> and then doing a file diff with last export is a huge overhead.
OK. So Lars' suggestion of using lastUpdated flag solves this problem?
> As for gis import export, i suggested we add something 2 the
> blueprint. Either in dxf or java object serialization. Both have some
> merits/demerits wrt our prob. Oracle had that gis type in the db for
> something. Those arguments are valid in our context.
OK. Add something to the blueprint and let's look at it. I notice the
mesh4x project which Knut has been getting charmed with is using KML.
As is OpenLayers. Probably would make sense to use that. Either way
we need to look first at what GIS data we would be
importing/exporting.
The new GIS solution for DHIS 2 is not using KML, but GeoJSON.
Conversion between these formats and PostGIS or MySQL/H2 spatial
tables is possible, but for now we will mainly be using JSON files
generated by GeoServer.
KML is an interesting format, and there are also a lot of possibilities
for using Google's maps. But I don't think you should spend time on
including polygon geometries in the DXF yet. Point coordinates could
perhaps be included, that's cheap.
But the InSTEDD toolset definitely has a lot to offer.
k
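For readers unfamiliar with GeoJSON, a minimal sketch of a single point feature, which is roughly the amount of information a point coordinate adds per facility. The facility name and coordinates are invented, and `point_feature` is a hypothetical helper.

```python
import json


def point_feature(name, longitude, latitude):
    """Build a GeoJSON Feature with a Point geometry.

    GeoJSON orders coordinates as [longitude, latitude].
    """
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [longitude, latitude]},
        "properties": {"name": name},
    }


feature = point_feature("ClinicA", 28.19, -25.75)
geojson_text = json.dumps(feature)
print(geojson_text)
```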
> The new GIS solution for DHIS 2 is not using KML, but GeoJSON.
> Conversion between these formats and PostGIS or Mysql/H2 spatial
> tables is possible, but for now we will mainly be using JSON files
> generated by GeoServer.
Ah. Object serialization ... we've been here before
This is way off my patch, but doesn't GeoServer also generate KML?
And OpenLayers can read both. Any particular reason for going the
GeoJSON route? It only becomes an issue if we want to integrate
geo-spatial metadata with "general" dxf data as part of an export.
Then KML would be much handier.
> KML is an interesting format, and there is also a lot of possibilities
> for using Google's maps. But I don't think you should spend time on
> including polygon geometries in the DXF yet. Point coordinates could
> perhaps be included, that's cheap.
Hadn't really planned to spend any time on it! Mind you if you wanted
to include KML polygons it would be pretty straightforward:
>> The new GIS solution for DHIS 2 is not using KML, but GeoJSON.
>> Conversion between these formats and PostGIS or Mysql/H2 spatial
>> tables is possible, but for now we will mainly be using JSON files
>> generated by GeoServer.
> Ah. Object serialization … we’ve been here before
> This is way off my patch, but doesn’t GeoServer also generate KML?
Yes, no problem.
> And OpenLayers can read both. Any particular reason for going the
> GeoJSON route?
We have a JavaScript client which works with JSON. We are planning to avoid including GeoServer to keep things lite.
> It only becomes an issue if we want to integrate
> geo-spatial metadata with “general” dxf data as part of an export.
By geospatial metadata, you mean polygons? It’s something we could include in the future, but it is something that is relatively static, so for now, distributing that in separate files should not be a problem. Still, it may be an interesting option to have in the future.
> Then KML would be much handier.
KML is definitely on the table with a number of solutions we are looking at in Geneva as well. So yes, but less urgent.
> Hadn’t really planned to spend any time on it! Mind you if you wanted
> to include kml polygons it would be pretty straight forward:
Straightforward, but big, if you have any kind of accuracy.
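A back-of-the-envelope sketch of the size concern: it renders a synthetic boundary as a KML `Polygon` and measures how the text grows with the number of vertices. The circular "district outline" and the six-decimal precision are assumptions chosen for illustration; real boundaries will differ.

```python
import math


def kml_polygon(vertices):
    """Render a closed ring of (lon, lat) pairs as a KML Polygon element."""
    ring = list(vertices) + [vertices[0]]  # KML rings repeat the first vertex at the end
    coords = " ".join(f"{lon:.6f},{lat:.6f}" for lon, lat in ring)
    return (
        "<Polygon><outerBoundaryIs><LinearRing><coordinates>"
        f"{coords}"
        "</coordinates></LinearRing></outerBoundaryIs></Polygon>"
    )


def circle(n, lon0=28.0, lat0=-25.0, radius=0.5):
    """Synthetic boundary: n points on a circle, standing in for a district outline."""
    return [
        (lon0 + radius * math.cos(2 * math.pi * i / n),
         lat0 + radius * math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]


# Characters needed for one polygon at increasing levels of detail.
sizes = {n: len(kml_polygon(circle(n))) for n in (10, 1_000, 10_000)}
print(sizes)
```

The markup overhead is constant, so the size is dominated by the coordinate list and grows linearly with the number of vertices per boundary.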