Do we still need to run analytics if continuous analytics is already running?

Do we still need to run the main (traditional) analytics job if continuous analytics is already running, say every 30 minutes?


Hi @jthomas

The continuous analytics table job is meant specifically for near-real-time updates of the latest data, and it is limited in what it updates: mostly only data added or changed since it last ran. For instance, it doesn’t take metadata changes into consideration.

You don’t need to run the full analytics tables export separately if there are no metadata changes and the full update already runs once a day, which is still important. However, if there have been any changes to the metadata, make sure to run the full analytics tables export.

Actually, I think the docs provide a much better explanation:

Continuous analytics table

The analytics tables job is responsible for generating and updating the analytics tables. The analytics tables are used as basis for data analytics queries in DHIS2. Apps such as dashboard, visualizer and maps retrieve data from these tables through the DHIS2 analytics API, and they must be updated in order for analytics data to become available. You can schedule this process to run regularly through an analytics table job type.

The continuous analytics table job is based on two phases:

  • Latest update: Update of the latest data, where latest refers to the data which has been added, updated or removed since the last time the latest data or the full data was updated. This process will happen frequently.
  • Full update: Update of all data across all years. This process will happen once per day.

The continuous analytics table job will frequently update the latest data. The latest data process utilizes a special database partition which is used to hold the latest data only. This partition can be quickly refreshed due to the relatively small amount of data. The partition will grow in size until a full update is performed. Once per day, all data for all years will be updated. This will clear out the latest partition.
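To make the two phases concrete, here is a minimal Python sketch of how a single job run could decide between a latest update and a full update. It is my own simplified model of the behaviour described above, not DHIS2's actual scheduler code; `choose_update` and its arguments are hypothetical names.

```python
from datetime import datetime, timedelta
from typing import Optional


def choose_update(now: datetime,
                  last_full: Optional[datetime],
                  full_update_hour: int) -> str:
    """Return "full" or "latest" for a job run happening at `now`.

    Simplified model (an assumption, not DHIS2 source code): a full update
    is due once the configured hour of day has passed and no full update
    has run since that point in time.
    """
    if last_full is None:
        # No full update has ever run, so there is nothing to build on.
        return "full"
    # Most recent scheduled full-update time at or before `now`.
    due = now.replace(hour=full_update_hour, minute=0, second=0, microsecond=0)
    if due > now:
        due -= timedelta(days=1)
    return "full" if last_full < due else "latest"


def simulate_day(start: datetime, last_full: datetime,
                 full_update_hour: int, interval_minutes: int = 30) -> dict:
    """Count full vs. latest updates over 24 hours of fixed-interval runs."""
    counts = {"full": 0, "latest": 0}
    now = start
    while now < start + timedelta(days=1):
        kind = choose_update(now, last_full, full_update_hour)
        counts[kind] += 1
        if kind == "full":
            # A full update rebuilds everything and clears the latest partition.
            last_full = now
        now += timedelta(minutes=interval_minutes)
    return counts
```

Under this model, a job that runs every 30 minutes with the full update hour set to 1 does one full update and 47 latest updates per day, which is why a separate daily full run is normally not needed.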

The analytics table job will by default populate data for all years and data elements. The following parameters are available:

  • Full update hour of day: The hour of the day at which the full update will be done. As an example, if you specify 1, the full update will be performed at 1 AM.
  • Last years: The number of last years to populate analytics tables for. As an example, if you specify 2 years, the process will update the two last years worth of data, but not update older data. This parameter is useful to reduce the time the process takes to complete, and is appropriate if older data has not changed, and when updating the latest data is desired.
  • Skip resource tables: Skip resource tables during the analytics table update process. This reduces the time the process takes to complete, but leads to changes in metadata not being reflected in the analytics data.
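As a rough illustration of what these three parameters control, here is a small Python sketch. The class and field names are hypothetical (they only mirror the labels above, not the real DHIS2 API), and it assumes the current year counts as one of the "last years".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContinuousAnalyticsParams:
    """Hypothetical container mirroring the three job parameters above."""
    full_update_hour: int = 0          # hour of day (0-23) for the full update
    last_years: Optional[int] = None   # None means "all years"
    skip_resource_tables: bool = False

    def __post_init__(self) -> None:
        if not 0 <= self.full_update_hour <= 23:
            raise ValueError("full_update_hour must be between 0 and 23")
        if self.last_years is not None and self.last_years < 1:
            raise ValueError("last_years must be at least 1 when set")

    @property
    def reflects_metadata_changes(self) -> bool:
        # Skipping resource tables means metadata changes are not picked up.
        return not self.skip_resource_tables


def years_to_update(params: ContinuousAnalyticsParams,
                    current_year: int,
                    earliest_data_year: int) -> List[int]:
    """Years the full update would cover.

    Assumption: the current year is included in the "last years" count,
    so last_years=2 in 2024 covers 2023 and 2024.
    """
    if params.last_years is None:
        first = earliest_data_year
    else:
        first = max(earliest_data_year, current_year - params.last_years + 1)
    return list(range(first, current_year + 1))
```

For example, `years_to_update(ContinuousAnalyticsParams(full_update_hour=1, last_years=2), 2024, 2015)` covers only 2023 and 2024 under that assumption, while leaving `last_years` unset covers every year from 2015 onward.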

(source)

I hope this clarifies it, but feel free to follow up if any part is still unclear. Thanks!


Thank you so much @Gassim for such a beautiful explanation


You’re welcome! :heart:
