Transforming High-Resolution Air Quality Data for DHIS2 Analytics through DHIS2 Climate Tools

This is a joint initiative between HISP Sri Lanka, the Ministry of Health of Sri Lanka, the HISP Centre, and CICERO (Norway) to integrate high-resolution air quality data into DHIS2 in support of health-environment analysis. CICERO analyzed satellite-derived aerosol optical depth data and applied bias correction using ground-level PM2.5 measurements from the National Building Research Organization (NBRO) of Sri Lanka. The result is a bias-corrected daily PM2.5 dataset at 1 km² resolution for the period 2020–2024.

The challenge lay in transforming this large, gridded NetCDF dataset into a format compatible with DHIS2, which required harmonization by organizational unit and time period. Using the DHIS2 Climate Tools and its Python utility library dhis2eo, a workflow was developed to automate reading the NetCDF data, aligning it spatially with DHIS2 district polygons, and computing population-weighted PM2.5 averages for each district. The workflow used the xarray and rioxarray libraries to process the multidimensional gridded data efficiently, while dhis2eo.integrations.pandas converted the processed outputs into DHIS2-friendly JSON structures. These were then uploaded via the Python client, bringing large-scale environmental data into DHIS2 alongside health information.

This implementation demonstrated how DHIS2 Climate Tools can bridge advanced environmental datasets and DHIS2 analytics. By automating the transformation and import process, it allowed air quality data to be visualized alongside health indicators, laying the groundwork for future analyses of pollution exposure and public health outcomes. The approach offers a replicable model for other countries aiming to operationalize climate and environmental data within their DHIS2 ecosystems.

We intend to make this workflow openly available through the DHIS2 Climate Tools repository so that other country teams and implementers can adapt and reuse it for similar applications. We welcome guidance from the core DHIS2 Climate Tools team on the best approach to document and contribute this workflow to the repository.
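
To make the population-weighted step concrete, here is a minimal sketch on a tiny synthetic grid. It is not the workflow's actual code: the `district_ids` mask (district polygons already rasterized onto the PM2.5 grid), the `population_weighted_mean` helper, and all values are assumptions made up for illustration. It only uses xarray's `weighted()` method.

```python
import numpy as np
import xarray as xr

# Synthetic 2 x 4 grid standing in for one day of gridded PM2.5 (µg/m³).
coords = {"lat": [7.0, 7.1], "lon": [80.0, 80.1, 80.2, 80.3]}
dims = ("lat", "lon")
pm25 = xr.DataArray([[10.0, 30.0, 20.0, 20.0],
                     [10.0, 30.0, 40.0, 60.0]], coords=coords, dims=dims)

# Population counts on the same grid (made-up numbers).
population = xr.DataArray([[100.0, 300.0, 50.0, 50.0],
                           [100.0, 300.0, 0.0, 100.0]], coords=coords, dims=dims)

# Hypothetical district mask: each cell holds the id of the district it falls in.
district_ids = xr.DataArray([[1, 1, 2, 2],
                             [1, 1, 2, 2]], coords=coords, dims=dims)


def population_weighted_mean(values, population, district_ids):
    """Return {district_id: population-weighted mean of values}."""
    results = {}
    for district in np.unique(district_ids.values):
        # Cells outside the district get weight 0, so they do not contribute.
        weights = population.where(district_ids == district, 0.0)
        results[int(district)] = float(values.weighted(weights).mean())
    return results


district_means = population_weighted_mean(pm25, population, district_ids)
# District 1: 25.0 (a plain mean would give 20.0)
# District 2: 40.0 (a plain mean would give 35.0)
```

The weighted mean pulls each district's value toward the cells where people actually live, which is the reason to prefer it over a plain spatial average for exposure estimates.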


Thanks for the detailed description. This is an exciting use case for us. For DHIS2 Climate Tools, I think there are at least two features we should showcase:

  1. Population-weighted aggregation
  2. Handling big datasets with the built-in Dask support in xarray
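
As a rough illustration of the second point, the sketch below chunks a synthetic array so that xarray hands the work to Dask. It assumes the `dask` package is installed; the array shape, chunk size, and variable names are made up for the example. With a real NetCDF file, passing `chunks=` to `xr.open_dataset` gives the same lazy behaviour at read time.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for daily gridded PM2.5: 30 days on a 20 x 20 grid.
rng = np.random.default_rng(0)
pm25 = xr.DataArray(
    rng.uniform(5, 60, size=(30, 20, 20)),
    dims=("time", "lat", "lon"),
    name="pm25",
)

# .chunk() turns the array into a Dask-backed one split into blocks of 10 days.
# For a file on disk: xr.open_dataset(path, chunks={"time": 10}).
pm25_lazy = pm25.chunk({"time": 10})

# Operations on a chunked array only build a task graph; nothing is computed yet.
daily_mean_lazy = pm25_lazy.mean(dim=("lat", "lon"))

# .compute() runs the graph block by block and returns an in-memory result.
daily_mean = daily_mean_lazy.compute()
```

Because each block is processed separately, the full dataset never has to fit in memory at once, which matters for several years of daily 1 km² grids.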

I would suggest opening a pull request to the DHIS2 Climate Tools repo with a Jupyter Notebook named “Population weighted aggregation”, placed in the “Data aggregation” folder.

This notebook could contain a subset of the gridded PM2.5 dataset for a few districts in Sri Lanka. We can download WorldPop data and crop it to the same districts. This keeps the dataset small enough that we can provide a runnable example. We can help with these steps if needed.

The Jupyter Notebook should combine code with short explanations for each step. If we use population data from WorldPop, it should be sufficient to use xarray (not rioxarray), as both datasets use geographic coordinates (longitude and latitude).
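
Since both grids share longitude and latitude coordinates, putting the population data on the PM2.5 grid can be done with plain xarray. The sketch below is one possible way, under the assumption that the population raster is finer than the PM2.5 grid and nests evenly in it (here by a factor of 2); the coordinates and values are synthetic. Counts are summed with `coarsen` rather than interpolated, so the total population is preserved.

```python
import numpy as np
import xarray as xr

# Synthetic PM2.5 on a coarse 2 x 2 grid (cell centres in degrees).
pm25 = xr.DataArray(
    [[10.0, 20.0], [30.0, 40.0]],
    coords={"lat": [7.25, 8.25], "lon": [80.25, 81.25]},
    dims=("lat", "lon"),
)

# Synthetic population counts on a grid twice as fine in each direction.
pop_fine = xr.DataArray(
    np.arange(16, dtype=float).reshape(4, 4),
    coords={"lat": [7.0, 7.5, 8.0, 8.5], "lon": [80.0, 80.5, 81.0, 81.5]},
    dims=("lat", "lon"),
)

# Sum each 2 x 2 block of fine cells; the new coordinates are the block centres.
pop_coarse = pop_fine.coarsen(lat=2, lon=2).sum()

# Snap onto the PM2.5 coordinates, tolerating tiny floating-point differences.
pop_on_pm25 = pop_coarse.reindex_like(pm25, method="nearest", tolerance=1e-6)

# The two arrays now share a grid and can be combined directly.
weighted_mean = float(pm25.weighted(pop_on_pm25).mean())
```

If the two grids do not nest evenly, a proper resampling step would be needed instead, which is where a tool such as rioxarray becomes useful.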
