I’m seeking input on how others handle data management for event data. Projects often ask us to import or update data in DHIS2 from a CSV/Excel file. These files range in size from 150 to 5,000 cases. The metadata in the Excel files matches the program metadata, so it is just a matter of importing the data.
I’ve used the Data Import Wizard for this work in the past, but it no longer works on the version of DHIS2 we’re on.
When testing the Bulk Load app, I experienced significant delays with tracker program data imports and updates. Is this expected, and have others experienced it?
The native import/export works, but not for the larger files. For example, I have 5,000 events I want to update, formatted for the import/export app. The app shows an ‘In progress’ status forever.
How are others navigating these challenges? Is there another method to complete this type of work?
Could you focus a bit on this? Before starting the import, could you open the Network tab and the Console tab in the DevTools (F12) to check if there’s any issue?
After that, could you check the Catalina.out log to see whether it shows any errors that explain why the import is taking forever?
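For anyone who finds Catalina.out too long to read by eye, here is a minimal sketch of a filter that pulls out lines that look like problems. It assumes that relevant lines contain the word ERROR or WARN, or a Java class name ending in Exception or Error; the function name `suspicious_lines` is just illustrative, and your log format may need a different pattern.

```python
import re
import sys

# Assumption: problem lines contain ERROR/WARN or a Java exception/error class name.
PATTERN = re.compile(r"\b(ERROR|WARN)\b|\b\w+(Exception|Error)\b")


def suspicious_lines(lines):
    """Return (line_number, text) pairs for lines that match PATTERN."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PATTERN.search(line)
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_log.py /path/to/catalina.out
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for number, text in suspicious_lines(handle):
            print(f"{number}: {text}")
```

Running it right after a stuck import and looking at the last few matches is usually enough to narrow things down.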
We’ve noticed similar things, @lnfregos. Our most consistently positive experience has been with import/export using JSON files. We also had to increase the import size limit for some of our work. We limit each import to 5,000 records and split anything larger into batches, as we’ve had less success when going over this amount (though it’s not consistent). That said, we’ve worked primarily with the event program model.
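The batching idea can be sketched roughly like this. It only splits a list of event records into JSON payloads of at most 5,000 events each and does not talk to DHIS2 at all. The `{"events": [...]}` wrapper is an assumption about the payload shape your version expects, and `build_payloads` is a hypothetical helper name.

```python
import json


def build_payloads(events, batch_size=5000):
    """Split event dicts into JSON strings holding at most batch_size events each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    payloads = []
    for start in range(0, len(events), batch_size):
        batch = events[start:start + batch_size]
        # Assumption: the importer accepts a top-level "events" list.
        payloads.append(json.dumps({"events": batch}))
    return payloads
```

Each returned string can be written to its own file and imported one at a time, which makes it easier to see which batch fails.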
Bulk Load has been up and down for us, for sure. The latest version does seem more stable, but the errors it reports are still quite basic and difficult to follow. Our success has generally come from paying careful attention to cell formats, and from toggling off all resource-heavy options like validations during import.
Doesn’t feel like that’s very helpful as a comment…mostly just saying you’re not alone!
There was no issue in the Network tab when I opened the Import/Export app. I’ve attached the Catalina.out logs from when I was doing this work: catalina-training-logs.json (70 KB).
When using a smaller file I’ve also run into issues with the import/export app. I’m looking to import new events to an already existing enrollment. When following the documentation on headers (attached below) I get an error in the summary …
“Event.trackedEntityInstance does not point to a valid tracked entity instance: null”
I added the tracked entity instance IDs to the import file using the header ‘trackedEntityInstance’ and I get another error as a popup (not in the summary) at the bottom of my window, picture below.
Note that as my goal is to create events that don’t already exist in the system, the event uids in my csv file came from the api call /api/system/id.csv?limit=#
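Before importing, it may help to sanity-check the file itself. The sketch below flags rows whose event UID or tracked entity instance UID is missing or malformed, and event UIDs that repeat within the file. DHIS2 UIDs are 11 characters, a letter followed by 10 letters or digits. The `trackedEntityInstance` header comes from the post above; the `event` header and the function name `check_rows` are assumptions for illustration.

```python
import csv
import io
import re

# DHIS2 UID format: one letter followed by 10 letters or digits.
UID_RE = re.compile(r"[A-Za-z][A-Za-z0-9]{10}")


def check_rows(csv_text, event_col="event", tei_col="trackedEntityInstance"):
    """Return (row_number, message) pairs for rows that look wrong."""
    problems = []
    seen = set()
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        event_uid = (row.get(event_col) or "").strip()
        tei_uid = (row.get(tei_col) or "").strip()
        if not UID_RE.fullmatch(event_uid):
            problems.append((row_number, f"invalid event UID: {event_uid!r}"))
        elif event_uid in seen:
            problems.append((row_number, f"duplicate event UID: {event_uid}"))
        seen.add(event_uid)
        if not UID_RE.fullmatch(tei_uid):
            problems.append((row_number, f"invalid tracked entity instance UID: {tei_uid!r}"))
    return problems
```

This only checks the shape of the file. It cannot tell whether a tracked entity instance actually exists on the server or whether an event UID is already in use there.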
Have you tried other options like the Bulk Load app? We have used it previously to import records, and the verbose log is helpful for identifying where the process is at any moment. It’s available on the DHIS2 play store.
Yes, we’ve explored the Bulk Load app but, as I mentioned above, it cannot handle the load needed for multi-stage programs. Do you have the same experience?
As for me, Bulk Load was much easier to use but still unsuccessful, because it reported that duplicates already exist for the events I was trying to upload. Yet they do not exist!
This happened when I had the status column set to COMPLETED and also included the trackedEntityInstance column to associate each event with a TEI. At the bottom you can see there seems to be a loop.
I slightly changed the CSV format and also followed the logs