Accessing indicators or multiple dataElements within subExpressions

Hello,

I’m trying to do a calculation that may be at the limit of what indicator analytics are meant to do (DHIS2 v. 2.40.3).

The goal is to have an indicator that represents, per health district, how many health facilities have cases over a facility-specific threshold for a specific three-month period (the current month plus the following two). The threshold is based on the long-term average of the case number for that same three-month season over the prior three years, which is stored as a separate dataElement. I know that technically subExpressions only take one dataElement, but it seems to be somewhat working with two and returns a valid expression.

The way I think I should write out this formula would be:

subExpression(if((#{DE1} + #{DE1}.periodOffset(1) + #{DE1}.periodOffset(2)) > (#{DE2}.periodOffset(-12) + #{DE2}.periodOffset(-11) + #{DE2}.periodOffset(-10) + #{DE2}.periodOffset(-24) + #{DE2}.periodOffset(-23) + #{DE2}.periodOffset(-22) + #{DE2}.periodOffset(-36) + #{DE2}.periodOffset(-35) + #{DE2}.periodOffset(-34))/3,1,0)).aggregationType(SUM)

In English, for each facility, if the number of cases predicted over the next three months is greater than the average over the prior 3 years for the same period for the 2nd visit, return a 1. Then sum these to the district level, resulting in the number of facilities over this threshold.
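To make the intended logic concrete, here is a minimal Python sketch of what the expression is meant to compute. It is not DHIS2 code. It assumes months are represented as plain integer indices, that values live in dictionaries keyed by (facility, month), and that a missing value counts as 0; the function and variable names are hypothetical.

```python
from collections import defaultdict

# Offsets mirror the periodOffset() calls in the expression above.
FORECAST_OFFSETS = (0, 1, 2)
HISTORY_OFFSETS = (-12, -11, -10, -24, -23, -22, -36, -35, -34)


def over_threshold(de1, de2, facility, month):
    """Return 1 if the three-month DE1 total exceeds the DE2 sum divided by 3."""
    # Assumption: a missing (facility, month) entry is treated as 0.
    forecast = sum(de1.get((facility, month + k), 0) for k in FORECAST_OFFSETS)
    threshold = sum(de2.get((facility, month + k), 0) for k in HISTORY_OFFSETS) / 3
    return 1 if forecast > threshold else 0


def facilities_over_threshold(de1, de2, district_of, month):
    """Sum the per-facility 0/1 flags up to the district level."""
    counts = defaultdict(int)
    for facility, district in district_of.items():
        counts[district] += over_threshold(de1, de2, facility, month)
    return dict(counts)
```

The flag is computed per facility first and only then summed, which is the behaviour I am hoping to get from the subExpression with aggregationType(SUM).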

The aggregation doesn’t seem to be working: the indicator still returns 0 at the district level, even though counting the facilities by hand gives a different value.

One solution I tried, if only to make this easier to manage and read, was to create an indicator for each side of the comparison in the if statement, but it seems that indicators don’t work within subExpressions. I would use a predictor, but I need to specify these months in a very specific way that doesn’t seem possible with the sequential sampling functionality.

Any ideas/thoughts? Happy to share a seed of some fake data to play around with.

Hi @mv_evans

Are the three months that you are comparing to the threshold months that already have data? If there is no data, then to the best of my knowledge using indicators is not an option.

Hi @Gassim , yes they already have data.

We are making some DHIS2 indicators to support a forecasting application, so there are some dataElements that exist for the future. The thresholds are only made from historical data, however.

Hi @mv_evans

Oh, okay, I think I get it better now, so in other words, we’re not using the DHIS2 Indicators to ‘predict’ since we’re already giving the predicted values, right?

Wouldn’t it be easier if you used combined indicators instead of subExpressions?

  • Expression: count health facilities.
  • Filter: count (events or enrollments?) > threshold

Ah okay, so rather than a SUMIF statement we would have:

  • one indicator that represents the month x health facility threshold
  • one indicator that represents the forecasted counts
  • one indicator that is the filter of the facilities with forecasted values above their threshold
  • one indicator that is the count of the filtered facilities

Is this kind of what you are imagining? I’ve currently got some indicators for the first two to represent kind of intermediate calculations, but am having trouble getting the last ones because I can’t combine two indicators in a subExpression.

I’ve currently just wrapped it all in a Python script that GETs the relevant indicators, counts them and POSTs the count as a new dataElement. Probably not the most elegant, but it can be run with the rest of the Python updating scripts to simplify things.
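For anyone wanting to do something similar, here is a rough sketch of the counting step, not the actual script. It assumes the GET response has the usual analytics shape with "headers" and "rows", that it covers a single period, and that the result is sent as a "dataValues" payload; all IDs and function names below are placeholders. The HTTP calls themselves are left out.

```python
def rows_to_values(analytics):
    """Map (dx, ou) -> float from an analytics-style response dict.

    Assumption: the response covers one period, so (dx, ou) is unique.
    """
    names = [h["name"] for h in analytics["headers"]]
    dx_i, ou_i, val_i = names.index("dx"), names.index("ou"), names.index("value")
    return {(row[dx_i], row[ou_i]): float(row[val_i]) for row in analytics["rows"]}


def count_over_threshold(values, forecast_id, threshold_id):
    """Count facilities whose forecast indicator exceeds their threshold indicator."""
    count = 0
    for (dx, ou), forecast in values.items():
        if dx != forecast_id:
            continue
        threshold = values.get((threshold_id, ou))
        # Facilities without a threshold value are skipped rather than counted.
        if threshold is not None and forecast > threshold:
            count += 1
    return count


def build_payload(data_element, period, org_unit, count):
    """Build a data value payload holding the district-level count."""
    return {
        "dataValues": [
            {
                "dataElement": data_element,
                "period": period,
                "orgUnit": org_unit,
                "value": str(count),
            }
        ]
    }
```

Keeping the parsing, counting and payload steps as separate functions makes them easy to check against fake data before pointing the script at a real instance.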
