NOTE that I used the “skip checking” option, because I was importing into an instance where I had deleted all records in the datavalue table.
I will try to reproduce it on demo - if it does not happen there, I can make the “culprit” instance available. I initially noticed those few duplicates because I was checking the number of records in the datavalue table against the number of rows in the CSV file - then when importing, it would bomb out when encountering the first duplicate.
On 7 December 2015 at 14:31, Lars Helge Øverland larshelge@gmail.com wrote:
Hi Calle,
that’s strange - I just tested by importing a CSV data value file on the demo and duplicates were ignored. What version are you on? Can you reproduce on demo?
regards,
Lars
–
On Fri, Dec 4, 2015 at 11:31 AM, Calle Hedberg calle.hedberg@gmail.com wrote:
Hi
FYI, I’ve just done another import of a 120 MB xml-zip file - the upload took around 30 minutes and the actual import around 18 minutes.
Regards
Calle
Mailing list: https://launchpad.net/~dhis2-devs
Post to : dhis2-devs@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-devs
More help : https://help.launchpad.net/ListHelp
–
Lars Helge Øverland
Lead developer, DHIS 2
University of Oslo
Skype: larshelgeoverland
http://www.dhis2.org
On 3 December 2015 at 20:34, Calle Hedberg calle.hedberg@gmail.com wrote:
Hi
By the way, using CSV should reduce file size and speed up the import, but there seems to be a bug somewhere in the CSV export: the total number of records exported was slightly higher than the actual number of data records selected (we are talking 3-4 duplicated records out of 8 million), and as a result the import crashes as soon as the first duplicate record is encountered. This does not happen when exporting XML. I did drill into one of these duplicates and found it to belong to the “first” orgunit in the alphabetical list and also the “earliest” period in the source system.
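A minimal sketch of how such duplicated records could be located in the CSV file before importing. It assumes the export has a header row and that a record is identified by the columns dataelement, period, orgunit, categoryoptioncombo and attributeoptioncombo; those column names are an assumption and should be checked against the actual file. The function name and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter

# Assumed key columns; adjust to match the header of the actual export.
KEY_COLUMNS = ("dataelement", "period", "orgunit",
               "categoryoptioncombo", "attributeoptioncombo")


def find_duplicate_keys(csv_file, key_columns=KEY_COLUMNS):
    """Return {key_tuple: count} for every key that occurs more than once."""
    reader = csv.DictReader(csv_file)
    counts = Counter(tuple(row[col] for col in key_columns) for row in reader)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up sample: the first two rows share the same key.
sample = io.StringIO(
    "dataelement,period,orgunit,categoryoptioncombo,attributeoptioncombo,value\n"
    "de1,201501,ou1,coc1,aoc1,5\n"
    "de1,201501,ou1,coc1,aoc1,5\n"
    "de1,201502,ou1,coc1,aoc1,7\n"
)
print(find_duplicate_keys(sample))
```

For a real export, pass an open file instead of the in-memory sample, for example open(path, newline=""). All keys are held in memory, so a file with around 8 million rows may need a few gigabytes of RAM.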
Regards
Calle
–
Calle Hedberg
46D Alma Road, 7700 Rosebank, SOUTH AFRICA
Tel/fax (home): +27-21-685-6472
Cell: +27-82-853-5352
Iridium SatPhone: +8816-315-19119
Email: calle.hedberg@gmail.com
Skype: calle_hedberg
On 3 December 2015 at 20:28, Calle Hedberg calle.hedberg@gmail.com wrote:
Jason,
I fully understand that, and it’s only done infrequently (and outside office hours if it’s a production instance).
The 75 MB xml-zip file that I just uploaded had only around 8 million records - the upload took 20 minutes and the actual import around 13 minutes. No problemo…
Regards
Calle
On 3 December 2015 at 20:12, Jason Pickering jason.p.pickering@gmail.com wrote:
Hi Calle,
I think you would want to be very careful with this. If you change the maximum file size to 200 MB, the upload could potentially be a zipped file that expands to several (or tens of) gigabytes, or many millions of rows of data. This could put significant stress on the server, and preventing such huge uploads from being imported is really the entire point of the restriction. If you have limited who can upload data to the server, it may be OK, but just be aware that a 200 MB zip file can be much, much larger once unzipped (by an order of magnitude or two) and result in a very long process.
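To make the point about unzipped size concrete, here is a small sketch that reads the compressed and uncompressed sizes recorded in a zip archive without extracting it. It uses only Python's standard zipfile module; the function name and the in-memory sample payload are made up for illustration.

```python
import io
import zipfile


def zip_sizes(zip_source):
    """Return (compressed_bytes, uncompressed_bytes) summed over all members.

    The sizes come from the archive's own directory entries, so nothing
    is decompressed.
    """
    with zipfile.ZipFile(zip_source) as archive:
        members = archive.infolist()
        compressed = sum(m.compress_size for m in members)
        uncompressed = sum(m.file_size for m in members)
    return compressed, uncompressed


# Build a small, highly repetitive archive in memory as a stand-in
# for a real export file.
payload = b"<dataValue value='1'/>\n" * 10000
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("data.xml", payload)
buffer.seek(0)

compressed, uncompressed = zip_sizes(buffer)
print(f"compressed={compressed} bytes, uncompressed={uncompressed} bytes")
```

Running zip_sizes on a real export (by passing its path) before uploading gives a rough idea of how much data the server will actually have to process.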
Regards,
Jason
On Thu, Dec 3, 2015, 18:52 Calle Hedberg calle.hedberg@gmail.com wrote:
Hi
We are now standardising on 200M
Regards
Calle
On 3 December 2015 at 17:38, Alan Ivey aivey@baosystems.com wrote:
Also, it’s worth noting that the default for “client_max_body_size” is only 1 MB: http://nginx.org/en/docs/http/ngx_http_core_module.html#client_max_body_size . It will need to be increased on most deployments of DHIS2.
On Thu, Dec 3, 2015 at 9:09 AM, Lars Helge Øverland larshelge@gmail.com wrote:
If you are indeed using nginx, the “client_max_body_size” directive is part of the example in the installation docs and can be increased as appropriate:
https://www.dhis2.org/doc/snapshot/en/implementer/html/ch08s04.html#d5e575
Lars
On Thu, Dec 3, 2015 at 3:02 PM, Lars Helge Øverland larshelge@gmail.com wrote:
Hi Calle,
I think this depends on the web server configuration. One can configure max file size for uploads in both the proxy (nginx, apache) and servlet container (tomcat).
On nginx the directive is:
client_max_body_size 200M;
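As a rough illustration of what this limit means for a given file, the sketch below converts an nginx-style size such as 200M to bytes and compares it with an upload size. The function names are hypothetical, and the handling of the k, m and g suffixes and of 0 (no limit) follows my reading of the nginx documentation rather than anything confirmed in this thread. nginx checks the size of the whole request body, which for a form upload is slightly larger than the file itself.

```python
# Multipliers for the size suffixes assumed to be accepted by nginx.
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '200M', '512k' or '1048576' to bytes."""
    text = text.strip().lower()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)


def body_fits(size_bytes, limit="200M"):
    """Return True if a request body of size_bytes is within the limit.

    A limit of 0 is treated as 'no limit'.
    """
    limit_bytes = parse_size(limit)
    return limit_bytes == 0 or size_bytes <= limit_bytes


# A 75 MB file against the 200M setting and against a 1m setting.
print(body_fits(75 * 1024 ** 2, "200M"))
print(body_fits(75 * 1024 ** 2, "1m"))
```

For a file on disk, os.path.getsize(path) gives the value to pass as size_bytes.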
regards,
Lars
On Thu, Dec 3, 2015 at 2:44 PM, Calle Hedberg calle.hedberg@gmail.com wrote:
Hi,
I have found that there is a limitation in file size when importing data into our SERVER-based instances, while I have found no equivalent limitation when importing large data files (e.g. XML format) into an equivalent localhost instance. A few key aspects:
- Both the server and localhost are running the latest version of 2.20
- Both are running 64-bit Java 8 and Tomcat 8.0.26 or 8.0.29
- Localhost tomcat has 4GB (min) and 8GB (max) allocated
- The server instance (running under Ubuntu Linux) has ~5.3GB RAM, but increasing/decreasing RAM has no effect on the issue.
The problem is related to the upload process.
Example:
When importing a 75 MB data file with around 8 million data records (XML, zipped) on localhost, the initial upload step is almost instantaneous (2-3 seconds) and then the actual import starts (takes about 10 minutes overall).
When importing the same file to the equivalent instance on the server, it takes around 30 seconds to reach 2% upload and then the upload re-starts at 0% - this goes on ad infinitum.
Smaller files - e.g. 10-20MB - will maybe import 15-20%, then reset to 0% and start over.
It seems to me that the problem is related to the DHIS2 web server configuration: it does not allow sufficient time for the upload to happen.
Any indications of how to fix this would be appreciated. While dumping the server instance into localhost, importing the data, and then uploading/restoring the instance does work, it is a pain in the b…
Regards from a sunny Cape Town
Calle