DHIS2 Tracker to Predictive Modelling using Fabric

This community innovation has been accepted at the 2026 DHIS2 Annual Conference and will be included in a session.



Background: DHIS2 Trackers capture rich longitudinal data but are often used primarily for retrospective reporting. As data volumes grow, API-based analytics can strain system performance. The U.S. Government-funded Meeting Targets and Maintaining Epidemic Control (EpiC) project supports large-scale HIV service delivery, creating demand for dashboards, scalable tracker-to-aggregate (T2A) reporting, and predictive modelling.

Methods: We aim to develop an end-to-end Microsoft Fabric architecture that operationalizes both scalable analytics and responsible machine learning. So far, we have built a data pipeline that loads DHIS2 data into OneLake. The data are published as a master data model, giving analysts an easy-to-access data source. For T2A, we used Fabric notebooks (PySpark) that push the aggregated results into DHIS2 nightly. Next, we will use notebooks for feature engineering and classification model development, to understand whether we can predict outcomes of key clinical significance such as seroconversion, interruption in treatment, and increased viral load.

Lessons Learned: Our previous API-based approach required staff to repeat the same data extraction processes multiple times, straining the system and causing data inconsistencies across dashboards. The new approach provides a single source of truth, enabling analysts to build dashboards, decentralizing workloads, and empowering data-driven decision makers. Using traditional T2A methods (see PDAC, Aggregate Data Exchange), aggregating data for fewer than 10% of indicators in one country took over 48 hours. With the new approach, aggregating all data takes 16 minutes, allowing daily updates without impacting system performance.

Conclusions & Future Directions: Microsoft Fabric provides a scalable foundation for DHIS2 analytics and responsible AI. Future work will explore Microsoft tools for predictive analysis.
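The tracker-to-aggregate (T2A) step described under Methods can be illustrated with a minimal sketch. It is not the project's implementation: the abstract states that the actual work runs in Fabric notebooks with PySpark, whereas this example uses plain Python so that it is self-contained. The event field names (`orgUnit`, `eventDate`), the identifiers (`OU_A`, `OU_B`, `DE_HIV_TESTS`), and the aggregation rule (a count of events per organisation unit per month) are all assumptions made for illustration. The output follows the general shape of a DHIS2 data value set (`dataElement`, `period`, `orgUnit`, `value`), and no request is sent to any server.

```python
from collections import Counter
from datetime import date


def to_monthly_period(event_date: date) -> str:
    """Format a date as a DHIS2 monthly period string (yyyyMM)."""
    return f"{event_date.year}{event_date.month:02d}"


def aggregate_events(events, data_element_uid):
    """Count tracker events per (org unit, month) and build a data value set.

    `events` is an iterable of dicts with hypothetical keys `orgUnit` and
    `eventDate`; `data_element_uid` is the aggregate data element that
    receives the counts (a placeholder here, not a real UID).
    """
    counts = Counter(
        (event["orgUnit"], to_monthly_period(event["eventDate"]))
        for event in events
    )
    return {
        "dataValues": [
            {
                "dataElement": data_element_uid,
                "period": period,
                "orgUnit": org_unit,
                "value": str(count),
            }
            # Sort for a deterministic payload, which makes nightly runs comparable.
            for (org_unit, period), count in sorted(counts.items())
        ]
    }


# Hypothetical sample events standing in for rows read from the data model.
sample_events = [
    {"orgUnit": "OU_A", "eventDate": date(2025, 3, 1)},
    {"orgUnit": "OU_A", "eventDate": date(2025, 3, 20)},
    {"orgUnit": "OU_B", "eventDate": date(2025, 3, 5)},
    {"orgUnit": "OU_A", "eventDate": date(2025, 4, 2)},
]

payload = aggregate_events(sample_events, "DE_HIV_TESTS")
```

In a setup like the one described, a payload of this shape would be what a nightly job sends to DHIS2. Reading events once from a shared data model, rather than re-extracting them per dashboard through the API, is the design choice the abstract credits for consistent numbers across dashboards.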

Primary Author: Pradeep Kumar Thakur


Keywords:
Data Modelling, AI, HIV, Analytics
