Can I import written data from a photo?

My team built a form in Microsoft Word to collect data. The form is printed by our users and filled out by hand on a piece of paper. Unfortunately, asking users to complete the form digitally is not an option.

If we were to rebuild the form in DHIS2, is there a way to take a photo of the filled-out paper form and import the handwritten data so that it can be analyzed?

Importantly, the tool would need to recognize and differentiate between the pre-printed form prompts (e.g., “Name”) and the handwritten data (e.g., “Stephen Muse”) so that when we import the result, it pulls only the handwritten data and pulls it into the right field.

If anyone has any tips on whether this is possible and how to go about it, I’d really love to connect!

Thank you,
-Stephen

@slunamuse Welcome to the community! :grin::tada: As far as I know, there are no tools built into DHIS2 to do this, which means you would need a custom solution.

Analysing images to extract text is a common problem, though, so there are existing libraries in Python, JavaScript and other programming languages to deal with that part of the issue.

You could also check the DHIS2 App Hub to see if there are any existing custom tools that suit your needs.

Otherwise a custom application will have to be developed.
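To give an idea of what the custom part could look like: most OCR libraries return each recognized word together with its bounding box. If the form layout is fixed, one way to keep only the handwritten answers is to define an "answer zone" for each field and ignore any word whose centre falls outside every zone. The sketch below assumes that approach. The `Word` structure, the zone coordinates and the field names are all hypothetical, and the OCR step itself is not shown, so the sample words stand in for whatever a real library would return.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One word as an OCR step might report it (hypothetical structure)."""
    text: str
    x: float  # left edge of the bounding box, in pixels
    y: float  # top edge of the bounding box
    w: float  # box width
    h: float  # box height


# Hypothetical template: field name -> (left, top, right, bottom) of the area
# where the user writes the answer. Printed prompts sit outside these zones.
FIELD_ZONES = {
    "name": (200, 100, 600, 140),
    "age": (200, 160, 300, 200),
}


def extract_fields(words, zones=FIELD_ZONES):
    """Assign each word to the zone containing its centre; drop the rest."""
    hits = {name: [] for name in zones}
    for word in words:
        cx = word.x + word.w / 2
        cy = word.y + word.h / 2
        for name, (left, top, right, bottom) in zones.items():
            if left <= cx <= right and top <= cy <= bottom:
                hits[name].append((word.x, word.text))
                break
    # Assumes single-line answers, so left-to-right order is enough.
    return {
        name: " ".join(text for _, text in sorted(found))
        for name, found in hits.items()
    }


# Stand-in for OCR output: printed prompts plus handwritten answers.
sample_words = [
    Word("Name", 50, 105, 80, 30),
    Word("Muse", 330, 104, 90, 30),
    Word("Stephen", 210, 106, 110, 30),
    Word("Age", 50, 165, 60, 30),
    Word("34", 210, 165, 40, 30),
]
result = extract_fields(sample_words)
print(result)
```

This only works if every photo is aligned to the same template, so a real tool would probably need to straighten and scale the photo first.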

Hi @slunamuse and welcome to the CoP :smile:

There may be an OCR tool that links with DHIS2 out there, but I don't know of one, and it doesn't look like there is one on the App Hub.

Have you considered the “Insert data from picture” function in Excel? It seems well suited to a data pipeline: digitally scan the paper forms, manually verify them, then upload them into some kind of Extract/Transform/Load tool for importing into a data repository like DHIS2.
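As a rough illustration of the "manually verify" step, a few simple checks can point the person doing the verification at the rows most likely to contain a misread value. This is only a sketch: the field names, the checks and the date format are assumptions for the example, not anything Excel or DHIS2 provides.

```python
from datetime import datetime


def is_valid_date(value):
    """True if the value parses as a YYYY-MM-DD date (assumed format)."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def is_valid_age(value):
    """True if the value is a whole number in a plausible range."""
    value = value.strip()
    return value.isdecimal() and 0 <= int(value) <= 120


# Hypothetical fields and rules; adjust to the real form.
CHECKS = {
    "name": lambda v: bool(v.strip()),
    "age": is_valid_age,
    "visit_date": is_valid_date,
}


def find_problems(row):
    """Return the names of the fields that fail their check."""
    return [
        field for field, check in CHECKS.items()
        if not check(row.get(field, ""))
    ]


def split_for_review(rows):
    """Separate rows that pass every check from rows a person should look at."""
    clean, flagged = [], []
    for row in rows:
        problems = find_problems(row)
        if problems:
            flagged.append((row, problems))
        else:
            clean.append(row)
    return clean, flagged


rows = [
    {"name": "Stephen Muse", "age": "34", "visit_date": "2024-03-05"},
    {"name": "Jane Doe", "age": "3A", "visit_date": "2024-03-05"},  # misread digit
]
clean, flagged = split_for_review(rows)
print(len(clean), "clean;", len(flagged), "flagged for review")
```

Rows that pass these checks can still be wrong, so this would complement a manual check rather than replace it.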

Hi @plinnegan and @Brian! Thanks for the warm welcome; I’m thrilled to be here :slight_smile:

Peter, thank you for the suggestion of the App Hub; I am new to DHIS2, so I had not considered that - I will check it out and see what I might be able to find.

And Brian, that Excel function is an excellent idea! In fact, I was able to test that out yesterday, and it unfortunately didn’t quite meet our needs. It is only able to capture the form as a whole (rather than just the fields we fill in manually), and it has a tough time recognizing the table structure, so much of the text ends up in the wrong spot. But I’m hopeful that Microsoft will continue refining it because it has all the promise of being exactly what we need.

Thank you both again for taking the time to reply; I really appreciate it!
-Stephen

Ohh, the Excel thing is cool! @slunamuse, I would say that if the data ends up in the wrong spot, but in the same wrong spot every time, this would not be a problem: in a second tab you could use formulas to reference the original cells and essentially rearrange the data into the correct format.
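The same idea works outside Excel if the sheet is exported and read into a script: keep a fixed map from each field to the cell where its value consistently lands, and look the values up. A minimal sketch, assuming the sheet has been loaded as a list of rows. The cell addresses and field names in `CELL_MAP` are made up for illustration.

```python
import re

# Hypothetical map: field name -> Excel-style address where the value lands.
CELL_MAP = {"name": "C2", "age": "A5", "visit_date": "D5"}


def cell_to_index(address):
    """Convert an address like 'C2' to zero-based (row, column) indexes."""
    match = re.fullmatch(r"([A-Z]+)([1-9][0-9]*)", address)
    if match is None:
        raise ValueError(f"Not a cell address: {address!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters:
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


def rearrange(grid, cell_map=CELL_MAP):
    """Pull each field's value from its known cell; blank if the cell is missing."""
    record = {}
    for field, address in cell_map.items():
        row, col = cell_to_index(address)
        try:
            record[field] = grid[row][col]
        except IndexError:
            record[field] = ""
    return record


# Stand-in for a scanned sheet where values land in odd but consistent cells.
grid = [
    ["Name", "", "", ""],
    ["", "", "Stephen Muse", ""],
    ["", "", "", ""],
    ["Age", "Visit date", "", ""],
    ["34", "", "", "2024-03-05"],
]
record = rearrange(grid)
print(record)
```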

The data needs to be in a specific format to be imported into DHIS2 in the first place, so there is likely going to be some rearrangement of the data structure needed in any case.
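To make "specific format" a bit more concrete, here is a sketch that turns one rearranged record into a JSON payload shaped like the data value sets that the DHIS2 Web API accepts for aggregate data, as I understand it. Please check the API documentation for your DHIS2 version before relying on it. Every identifier below is a placeholder rather than a real UID, and the field names are invented. Individual-level data such as a person's name would normally go through tracker or event import instead, which uses a different structure.

```python
import json

# Placeholder identifiers, NOT real UIDs; look up the real ones in your instance.
DATA_ELEMENT_UIDS = {
    "patients_seen": "DE_UID_SEEN",
    "patients_referred": "DE_UID_REFERRED",
}


def build_data_value_set(record, org_unit, period, data_set):
    """Build a data value set dict from one form record, skipping blank fields."""
    data_values = [
        {"dataElement": uid, "value": str(record[field]).strip()}
        for field, uid in DATA_ELEMENT_UIDS.items()
        if str(record.get(field, "")).strip()
    ]
    return {
        "dataSet": data_set,
        "period": period,      # e.g. "202403" for March 2024 (monthly period)
        "orgUnit": org_unit,
        "dataValues": data_values,
    }


payload = build_data_value_set(
    {"patients_seen": 12, "patients_referred": ""},
    org_unit="OU_UID_PLACEHOLDER",
    period="202403",
    data_set="DS_UID_PLACEHOLDER",
)
print(json.dumps(payload, indent=2))
```

The script only builds and prints the payload. Sending it to the server is left out on purpose, since that depends on your instance and credentials.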

Oh Peter, that’s a really good point; I had not thought about using a secondary reference table in Excel to pull the data into the right format. I can run a few more tests and see how it goes!