Join members of the DHIS2 core team and HISP network for a presentation on implementation guidance for optimizing large-scale Tracker performance, with a particular focus on COVID-19 vaccination and surveillance systems. This webinar will take place on Friday, 21 January, from 14:00-15:30 Oslo time (GMT+1).
Complete this form to register for this free event, and you will receive a link to the webinar by email before the event takes place: online registration form
All registered participants will be invited to join the webinar on Zoom. You can also view a live stream of the webinar here on the DHIS2 Community of Practice or on the DHIS2 YouTube channel.
Please add any questions or comments in the thread below!
A first question for the webinar: How do program indicators work? Are they calculated online? That is, are the calculations made when I run the analysis?
In preparation for this webinar, the Tracker team has prepared written guidance on optimizing Tracker implementation at scale. You can read it on the DHIS2 Documentation site: Tracker performance at scale - DHIS2 Documentation
Program indicators evaluate individual events/enrollments at the time of analysis. This means that when a user opens a dashboard that displays a program indicator, this act directly triggers a query that evaluates the program indicator. Program indicators that have complex expressions to evaluate, and/or query a lot of data, will be expensive to evaluate. Enrollment program indicators are commonly heavier than event program indicators.
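To make the "evaluated at analysis time" point concrete, here is a minimal sketch of the analytics request a dashboard item issues when it displays a program indicator. The UIDs (`deR23yt5KJ4` for the indicator, `ImspTQPwCqd` for the org unit) are placeholders, not real objects on any particular server.

```python
# Sketch of the on-demand analytics query a dashboard triggers for a
# program indicator. All UIDs below are hypothetical placeholders.

def program_indicator_query(base_url, pi_uid, org_unit, period):
    """Build the analytics API URL that evaluates a program indicator on demand."""
    return (
        f"{base_url}/api/analytics"
        f"?dimension=dx:{pi_uid}"   # the program indicator to evaluate
        f"&dimension=pe:{period}"   # reporting period
        f"&filter=ou:{org_unit}"    # organisation unit filter
    )

url = program_indicator_query("https://play.dhis2.org/demo", "deR23yt5KJ4",
                              "ImspTQPwCqd", "LAST_12_MONTHS")
# Every dashboard load issues a request like this, so a complex enrollment
# indicator is re-evaluated against raw tracker data each time.
```

This is why a heavy enrollment indicator on a popular dashboard multiplies its cost by the number of users opening that dashboard.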
The tracker-to-aggregate (T2A) mitigation presented in the webinar reverses this: instead of triggering evaluations when end users load dashboards, the program indicators are pre-evaluated at certain intervals. The pre-evaluated values are stored in aggregate data elements, removing the need for re-evaluation when users open the dashboards.
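A minimal sketch of that T2A pattern, assuming a hypothetical mapping from a program indicator to an aggregate data element (the UIDs are invented): a scheduled job evaluates the indicator once via analytics, transforms the rows, and stores them as aggregate data values.

```python
# Sketch of tracker-to-aggregate (T2A): pre-evaluate a program indicator
# and store the results in an aggregate data element. UIDs are placeholders.

PI_TO_DE = {"deR23yt5KJ4": "fbfJHSPpUQD"}  # program indicator -> aggregate data element

def to_data_value_set(analytics_rows, org_unit):
    """Convert analytics rows [[dx, pe, value], ...] into a dataValueSets payload."""
    return {
        "dataValues": [
            {
                "dataElement": PI_TO_DE[dx],
                "period": pe,
                "orgUnit": org_unit,
                "value": value,
            }
            for dx, pe, value in analytics_rows
        ]
    }

# A nightly job would GET /api/analytics for the indicator, transform the
# rows like this, then POST the payload to /api/dataValueSets. Dashboards
# then read the aggregate data element instead of re-evaluating the PI.
rows = [["deR23yt5KJ4", "202201", "42"]]
payload = to_data_value_set(rows, "ImspTQPwCqd")
```

The trade-off is freshness: dashboard values are only as current as the last scheduled run.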
I am thinking about another solution, though it would surely require a big update to how program indicators work. That is, the data resulting from a program indicator would be calculated and converted into aggregate data when the analytics are generated. For this to work, the program indicator would need two states (PROD, DEV): if it is in the PROD state, it is calculated and converted during analytics generation.
I believe an architecture option would be to have separate DHIS2 Tracker instances for smaller geographical or organisational units that are likely to work together and might share data, for example districts or municipalities. This limits the covered population and transactional data to a volume that can reasonably be analysed in real time.
Aggregated data could also be held locally, and relevant aggregated data could be fed to a national DHIS2 instance for the total overview on a daily basis.
A messaging subsystem might be needed if the data volume is large and the API becomes a bottleneck.
This webinar will start streaming at 14:00 Oslo time (GMT+1)! To watch it here on the CoP, just click the “play” button in the banner at the top of this page.
To ask questions or add comments, click the “reply” button at the bottom of the page.
It’s in fact not in the guide, though we’ll add a link there as well. The self-assessment/checklist is here, and it goes a bit beyond just the server - in many ways it summarises key points from the whole guide.
From Peter Linnegan: What are people’s thoughts on adding PIs to the analytics tables, so they don’t need to be calculated on the fly every time?
Answer: That can be a good idea in some cases, but not all. Some implementations run the PIs nightly as part of their analytics tables, and then, when the analytics are requested, only pull any new data since the nightly run. This is really only applicable when you have very tight parameters on how those PIs can be visualized, meaning the PI does not have many disaggregation dimensions such as age or sex.
Do you have any advice for load testing tracker? We’re using it for hypertension where there will likely be tens of thousands of patients enrolled in a district, 100K+ overall.
We’re trying to use the API to import TEIs, but maybe there are better approaches or existing research?
From the Android side we don't say reserved values are problematic per se, but if, for example, you are working on an online implementation with 10,000 devices, it might not make sense to reserve 1,000 values per device, as the devices will request more when values are generated. These are things you can tweak in the Android Settings Web App.
The UIDs are not downloaded; they are generated offline on each device.
Please let me know if this clarifies your question.
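For readers wondering how offline generation can be safe: DHIS2 object identifiers are 11-character UIDs (a letter followed by 10 alphanumeric characters), so each device can generate them locally without contacting the server. A sketch of that format (this is illustrative; the Android SDK has its own generator):

```python
import random
import string

# Generate a DHIS2-style UID offline: 11 characters, the first a letter,
# the remaining 10 letters or digits. Collisions are statistically unlikely
# given the size of the identifier space.

def generate_uid(rng=random):
    first = rng.choice(string.ascii_letters)
    rest = "".join(rng.choice(string.ascii_letters + string.digits) for _ in range(10))
    return first + rest

uid = generate_uid()
```

Note the distinction from reserved values: UIDs are generated locally like this, while reserved values for auto-generated tracked entity attributes are handed out by the server in batches.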
We have a load testing framework based on Locust and Java, and we aim to make the tool generic. If you are interested in trying it out (pilot), contact us at performance@dhis2.org. Generally, whatever tools you decide to use, you want to load test the most important user flows through the API: creating TEIs, loading them, and searching. It is also extremely important that your test instance is very similar to the production instance in terms of database size and hardware.
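As a starting point, the flows named above can be expressed as the payloads and queries they exercise. This is a sketch with demo-style placeholder UIDs (program, org unit, tracked entity type, and attributes); adapt them to your instance before pointing a load tool such as Locust at a production-sized test database.

```python
# Sketch of two tracker flows worth load testing, expressed as the
# request bodies/queries they exercise. All UIDs are placeholders.

def create_tei_payload(org_unit, tracked_entity_type, attributes):
    """Body for POST /api/trackedEntityInstances (create flow)."""
    return {
        "trackedEntityType": tracked_entity_type,
        "orgUnit": org_unit,
        "attributes": [{"attribute": a, "value": v} for a, v in attributes.items()],
    }

def search_tei_params(org_unit, program, attribute, value):
    """Query params for GET /api/trackedEntityInstances (search flow)."""
    return {
        "ou": org_unit,
        "program": program,
        "filter": f"{attribute}:EQ:{value}",
    }

payload = create_tei_payload("ImspTQPwCqd", "nEenWmSyUEp",
                             {"w75KJ2mc4zz": "John", "zDhUuAYrxNC": "Doe"})
params = search_tei_params("ImspTQPwCqd", "IpHINAT79UW", "w75KJ2mc4zz", "John")
```

Searching is often the most expensive flow at scale, so it is worth testing with attribute values that both hit and miss existing records.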
A number of countries have come up with various public-facing portals during the COVID-19 pandemic to allow people to pull vaccine certificates or testing results. There are a number of challenges with going beyond viewing data and instead allowing patients to enter their own information, both in the data model (where people are tracked entities rather than users) and when it comes to the source and validity of the data. We do not have a core solution for this at the moment, although we are looking into it and learning from use cases as they come along. TEI self-registration has been done successfully in some instances by setting up a public-facing questionnaire and creating middleware that inserts the registrations into the DHIS2 database.
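A hypothetical sketch of that middleware step: the public form never talks to DHIS2 directly; the middleware validates and maps submitted fields onto tracked entity attributes, dropping anything the programme does not track, before inserting the record with a dedicated service account. Field names and attribute UIDs here are invented.

```python
# Hypothetical middleware mapping for a public self-registration form.
# Only whitelisted fields are translated into tracked entity attributes,
# which addresses part of the source/validity concern. UIDs are invented.

FORM_TO_ATTRIBUTE = {
    "first_name": "w75KJ2mc4zz",
    "last_name": "zDhUuAYrxNC",
}

def registration_to_tei(form_data, org_unit, tracked_entity_type):
    """Map validated form fields onto a tracked entity instance payload."""
    return {
        "trackedEntityType": tracked_entity_type,
        "orgUnit": org_unit,
        "attributes": [
            {"attribute": FORM_TO_ATTRIBUTE[k], "value": v}
            for k, v in form_data.items()
            if k in FORM_TO_ATTRIBUTE  # drop fields the programme doesn't track
        ],
    }

tei = registration_to_tei({"first_name": "Ada", "last_name": "L.", "extra": "x"},
                          "ImspTQPwCqd", "nEenWmSyUEp")
```

The middleware would then POST this payload to the tracker import API on behalf of the self-registering person.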