Issue in Generating Analytics in 2.33 ERROR: index row size 6176 exceeds maximum 2712 for index

Dear Team,

I am facing issue in generation of Analytics in 2.33 with
Version: 2.33.0
Build revision: 071f135
Build date: 2019-10-07 14:46

Details error is

Caused by: org.springframework.dao.DataAccessResourceFailureException: StatementCallback; SQL [create index “in_iL5adZsIS53_geom_ax_qdkgabb5jlk_2017_u9vTI” on analytics_event_temp_qdkgabb5jlk_2017 (“iL5adZsIS53_geom”)]; ERROR: index row size 6176 exceeds maximum 2712 for index “in_iL5adZsIS53_geom_ax_qdkgabb5jlk_2017_u9vTI”
Hint: Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.; nested exception is org.postgresql.util.PSQLException: ERROR: index row size 6176 exceeds maximum 2712 for index “in_iL5adZsIS53_geom_ax_qdkgabb5jlk_2017_u9vTI”

1 Like

@dhis2-analytics. @dhis2-platform Requesting help! Thanks!

1 Like

Hey, we are looking into the issue. It might be related to our transition to using Geometry, but we will get back to you as soon as we find out something.

1 Like

thanks

1 Like

@Mithilesh_Kumar_Thak

I encountered the same problem yesterday, when upgrading the Sierra Leone eCBDS (IDSR) from 2.31.6 (analytics runs fine), via 2.32.2 (runs fine), to 2.33.0 (analytics crashes, with an error message similar to the one posted from you).

A chat with @Stian earlier today indicated that this problem relates to the introduction of “_geom” columns in the analytics tables. If you look at that event table you have, you will see that you have two related columns: one called “iL5adZsIS53” and one called “iL5adZsIS53_geom”. It is the latter type that’s new in 2.33

I was quickly able to ascertain that my variant of the same was the column that contains the organisation unit data FOR THE OU LEVEL WITH MOST POLYGONS, and I suspect it is the same in your case. In my case, the OU level with most polygons was the Chiefdoms at level 3. That level has only 162 chiefdoms (each with its own polygon = boundary data), so it is actually small - but still large enough to crash analytics.

That said, I decided to run a geometry simplification update on the organisationunit table, to see whether a certain amount of simplification (“smoothing” the boundaries) would reduce the complexity sufficiently to avoid analytics crashing.

Postgis supports a range of algorithms for simplifying polylines, but the safest to use for typical admin boundaries IMHO is the Douglas-Peucker algorithm because it do not created any possibly invalid new geometries in the process. All DHIS2 geometry data is in geographic decimal degree format (SRID 4326), so the related “ST_SimplifyPreserveTopology” function has two parameters: the first is the geometry column - called “geometry” in all DHIS2 tables containing them - and the second is the tolerance (in decimal degrees!!) to use for the simplification.

So I started with a very small tolerance (= little or no simplification), and ran the following sql script in the db:
UPDATE organisationunit
SET geometry = ST_SimplifyPreserveTopology(geometry, 0.00001);
That did NOT resolve it - analytics was still crashing.

Since my experience is that most DHIS2/CoP users have limited knowledge of spatial systems, note that in SRID 4326 (geographical, not projected, coordinates) we have the following rough equivalents between decimal degree coordinates and distance on the ground:

  • 1 degree is in general 60 nautical miles = ~110 km
  • 0.1 deg = ~11km
  • 0.01 deg = ~1.1km
  • 0.001 deg = ~100m
  • 0.0001 deg = ~10m
  • 0.00001 deg = ~1 metre

So the ST_SimplifyPreserveTopology function was initially run with a tolerance in the 1 metre range.

I then gradually increased the tolerance value:

  • 0.0001 (10 m range) - analytics still crashing
  • 0.001 (100m range) - analytics still crashing
  • 0.01 (1km range) - ANALYTICS NOW COMPLETED.

And interestingly, when I afterwards visually compared the original boundaries with the simplified ones, it was difficult to see any significant differences - at least in the Freetown area. It is therefore possible that simplification of the chiefdom and district layers for Sierra Leone, using a tolerance of 0.01 deg, mainly eliminated a FEW dense polyline points in some areas (I don’t have the time to investigate exactly which simplifications that were done).

So if you are in a hurry to get analytics to run again, consider running the Douglas-Peucker algorithm
/ ST_SimplifyPreserveTopology function on your organisationunit-geometry column. Remember to make a backup of the db, and gradually increase the tolerance value. I can give no guarantees that this will work for you, of course, but it did provide a short-term solution for me.

It is up to the core team to ensure a permanent solution to this (@Lars is looking into it, according to @Stian)

Best regards
Calle

1 Like

Dear @calle Hedberg

Thanks for reply as you suggested i have already done these processes but still facing issue

This only come only events analytics

I have also set value null to gemotery coloum in organisationunit table but still same issue.

Thanks again for your response/support

Thanks and Regards

Mithilesh Kumar Thakur

1 Like

@Mithilesh_Kumar_Thak

If nulling out the geometry column in the organisation unit table did not fix the problem:

Did you verify that iL5adZsIS53 in fact refers to the orgunit table? Or you are sure that you don’t have any other polygon geometries in your instance?

It makes little sense that removing all geometry (polygon) data should not at least temporarily solve the problem (especially since simplifying the polygons resolved it in my case).

Regards
Calle

1 Like

Thanks Celle Hedberg

Thanks for reply i will check and let you know

Thanks and Regards

Mithilesh Kumar Thakur

1 Like

Hi

For everybody’s info: I created JIRA issue 7976 for this yesterday, Lars assigned it to Luciano, and Luciano worked on it on his way to a holiday in India (commitment de luxe!!), and then sent me the following after landing:

“hi Calle, the issue is fixed. Basically, when we build an index for geometry column we need to use the USING GIST clause, as specified here: 15. Spatial Indexing — Introduction to PostGIS

The solution is submitted for review, so we can expect the fix to be implemented early next week. Since it’s unlikely that they will re-release 2.33.1 with this fix, I guess anybody having this problem will have to use 2.33 dev for now, until the release of 2.33.2 some time next year (?)

Best regards
Calle

1 Like

Hi

Fix was included into 2.34 and 2.33dev this morning.

Regards
Calle

1 Like