High CPU Usage on DHIS2 2.38.4.3

Is anyone experiencing high CPU usage on DHIS2 2.38.4.3? Any ideas are welcome.

Hi @Alex_Tumwesigye

Welcome back to the community! :slight_smile:

It might help more if you provide more information about your instance’s environment and setup. Additionally, you might find this topic helpful: High CPU use in a Linode server

Thanks!

I would try to gain some visibility into what the machine is doing:

  1. the machine: sudo top -bn1
    1. find the process eating the CPU, or check whether the machine is swapping
    2. you can also have a look at tools like glances or htop
  2. get a grasp of what the java process is doing: this will produce thread dumps with the call stack of the executing code
    for PID in $(jps | grep -v Jps | cut -d" " -f1); do jstack "$PID"; done
    (note you can use that site to parse and analyse the dumps a bit)
  3. the db: to get the running queries on the db, you can launch this in psql
    COPY (SELECT * FROM pg_stat_activity) TO STDOUT WITH CSV HEADER;
  4. monitor the access/error logs (is there a robot, or an over-accessed API/page?)
    cat /var/log/nginx/access.log | awk '{print $7}' | sort | uniq -c | sort -nr | head -n 20
    or
    cat /var/log/nginx/access.log | awk '{print $9}' | sort | uniq -c | sort -nr | head -n 20
    (adapt the field number according to your log format and HTTP server)
  5. check whether a heap dump is being produced on your system (you might see a huge file)
  6. for the java process you might be interested in jvm-mon to profile/monitor the JVM, or async-profiler (or use a commercial java profiler)
  7. for the db you can have a look at pg_activity
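
To tie steps 1 and 2 together, a common trick is to find the busiest thread with top -H -p <PID> and look it up in the jstack output. This is only a sketch: it assumes a HotSpot JVM that prints the native thread ID as hexadecimal in the nid= field (as Java 8 and 11 do), and the function names tid_to_nid and thread_stack are made up for the example.

```shell
# Convert a decimal Linux thread ID (the PID column of `top -H`) to the
# hexadecimal form used in the nid= field of a jstack thread dump.
tid_to_nid() {
  printf '0x%x\n' "$1"
}

# Print the jstack entry of one thread.
# $1 = PID of the java process, $2 = decimal thread ID taken from `top -H -p <PID>`
# Assumption: 15 lines of context are enough to see the top of the call stack.
thread_stack() {
  jstack "$1" | grep -A 15 "nid=$(tid_to_nid "$2") "
}
```

For example, if top -H -p 4242 shows thread 4660 at the top, thread_stack 4242 4660 searches the dump for nid=0x1234. Taking a few dumps some seconds apart helps to tell a thread that is really stuck from one that was just busy at that instant.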

Note that it’s possible to automate all of this and produce reports, launching the collection several times or whenever the phenomenon is observed.
This might allow you or a DHIS2 developer to see a trend or spot some suspects.
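
As a rough idea of what such automation could look like, here is a minimal sketch that gathers the outputs of the commands above into one timestamped directory. The function name snapshot, the default directory /tmp/dhis2-diag and the file names are my own choices, and it assumes psql can connect without a password prompt (for example when run as the postgres user).

```shell
# Collect one diagnostic snapshot into a timestamped directory and print its path.
# $1 = base directory (hypothetical default: /tmp/dhis2-diag)
snapshot() {
  out="${1:-/tmp/dhis2-diag}/$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$out" || return 1

  # Machine: process list and memory/swap usage. Tools that are missing are
  # skipped, and a failing tool does not stop the collection.
  if command -v top >/dev/null 2>&1; then
    top -bn1 > "$out/top.txt" 2>&1 || true
  fi
  if command -v free >/dev/null 2>&1; then
    free -m > "$out/free.txt" 2>&1 || true
  fi

  # JVM: one thread dump per java process listed by jps
  if command -v jps >/dev/null 2>&1; then
    for pid in $(jps | grep -v Jps | cut -d" " -f1); do
      jstack "$pid" > "$out/jstack-$pid.txt" 2>&1 || true
    done
  fi

  # Database: running queries; -w makes psql fail instead of asking for a password
  if command -v psql >/dev/null 2>&1; then
    psql -w -c "COPY (SELECT * FROM pg_stat_activity) TO STDOUT WITH CSV HEADER" \
      > "$out/pg_stat_activity.csv" 2>&1 || true
  fi

  echo "$out"
}
```

Calling snapshot /var/tmp/dhis2-diag from cron, or in a loop with a sleep while the CPU is high, gives a series of directories that can be compared afterwards.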

The usual suspects:

  1. analytics or continuous analytics running
  2. the machine is swapping
  3. the java process continuously heap dumping or triggering a lot of garbage collection (check the Xmx and Xms options)
  4. a script/batch using the API and hitting the same endpoint (or not reusing the session)
  5. a bot trying a lot of URLs to find an exploit
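
Suspects 2 and 3 can be checked quickly from the shell. The helpers below are a sketch under a few assumptions: a Linux machine with /proc/meminfo, and heap options passed on the java command line (as Tomcat does with JAVA_OPTS). The names swap_used_kb and heap_flags are hypothetical.

```shell
# Swap currently in use, in kB, computed from a meminfo-style file.
# $1 = file to read (defaults to /proc/meminfo)
# Note: used swap alone does not prove active swapping; non-zero si/so
# columns in `vmstat 1 5` are the stronger signal.
swap_used_kb() {
  awk '/^SwapTotal:/ {t = $2} /^SwapFree:/ {f = $2} END {print t - f}' "${1:-/proc/meminfo}"
}

# Extract the -Xms/-Xmx options from a java command line read on stdin.
heap_flags() {
  tr ' ' '\n' | grep -E '^-Xm[sx]'
}
```

Typical use would be swap_used_kb on its own, and ps -o args= -p <PID> | heap_flags for the java process. For garbage collection, jstat -gcutil <PID> 1000 5 prints five samples one second apart; an old generation that stays close to 100% while the full GC count keeps climbing usually points at a heap that is too small for the workload.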