How does DHIS2 protect personal information? (The tech details)

Stian · 26 November 2018 09:22

I don’t have the complete and detailed description for you, but I will share what I know. The information is based on recent versions of DHIS2 (2.30+).

User accounts have strict password requirements: a password must contain a mix of characters, and the system can block users from reusing a password they recently used.
DHIS2 implements different layers of security/access:
- Authorities
- Organisation Units (including “Breaking the glass”)
- Metadata and data read/write

Authorities
Authorities can be split into two types: M_* authorities and F_* authorities. The M_* authorities decide whether you have access to a specific module, which effectively means an app. Without the appropriate authority, the server will not serve the app. The F_* authorities, on the other hand, are based on actions (I usually refer to them as feature or functional authorities). They are required for special operations that should be reserved for specific users, for example starting server jobs or approving data. Authorities are mainly a layer of access for interacting with the software itself, but may in some cases provide additional access to data as well. The one important exception is the “ALL” authority, which grants the user a total override of all access checks. This is, however, not an authority normal users should have. A final note about authorities: they are attached to user roles, so you can design roles and assign them to users.
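To make the role/authority relationship concrete, here is a minimal sketch in Python. It is not DHIS2 code: the class names, the `has_authority` helper and the authority strings `M_example_app` and `F_EXAMPLE_ACTION` are placeholders I made up for illustration. It only shows the idea that a user's authorities are the union of the authorities of their roles, and that “ALL” overrides every check.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class UserRole:
    """A named bundle of authorities (hypothetical model, not the DHIS2 class)."""
    name: str
    authorities: frozenset[str]


@dataclass
class User:
    username: str
    roles: list[UserRole] = field(default_factory=list)

    def authorities(self) -> set[str]:
        # A user's authorities are the union of the authorities of all roles.
        granted: set[str] = set()
        for role in self.roles:
            granted |= role.authorities
        return granted


def has_authority(user: User, required: str) -> bool:
    """Return True if the user holds `required`, or the "ALL" override."""
    granted = user.authorities()
    return "ALL" in granted or required in granted


# Placeholder authority names, for illustration only.
clerk = User("clerk", [UserRole("Data entry", frozenset({"M_example_app"}))])
admin = User("admin", [UserRole("Superuser", frozenset({"ALL"}))])

print(has_authority(clerk, "M_example_app"))     # module access granted
print(has_authority(clerk, "F_EXAMPLE_ACTION"))  # action not granted
print(has_authority(admin, "F_EXAMPLE_ACTION"))  # granted through "ALL"
```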

Organisation Units
The second layer of access is the organisation units. All data and some metadata are associated with an organisation unit. Users are also associated with organisation units, in three ways: “Data capture and maintenance organisation units” (Capture), “Data output and analytic organisation units” (Analytics) and “Search Organisation Units” (Search). We often refer to these as “scopes”, so we have the capture scope, the analytics scope and the search scope. The core idea is that for any given scope, a user only has access to the organisation units selected for that scope and any organisation units below them in the hierarchy.
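The “selected organisation units plus everything below them” rule can be sketched as a walk up the hierarchy. This is only an illustration under my own assumptions: the `PARENT` map, the unit names and the `in_scope` function are hypothetical, and the real backend does not necessarily implement the check this way.

```python
from __future__ import annotations

# Hypothetical hierarchy, stored as child -> parent (None marks the root).
PARENT: dict[str, str | None] = {
    "Country": None,
    "District A": "Country",
    "District B": "Country",
    "Clinic A1": "District A",
}


def in_scope(org_unit: str, scope_roots: set[str],
             parent: dict[str, str | None] = PARENT) -> bool:
    """Return True if org_unit is one of scope_roots or lies below one of them."""
    current: str | None = org_unit
    while current is not None:
        if current in scope_roots:
            return True
        current = parent.get(current)  # step up one level in the hierarchy
    return False


capture_scope = {"District A"}
print(in_scope("Clinic A1", capture_scope))   # below District A
print(in_scope("District B", capture_scope))  # sibling, not in scope
print(in_scope("Country", capture_scope))     # parent, not in scope
```

The same check would be run once per scope (capture, search, analytics), each with its own set of selected organisation units.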

Assuming you know the different data models we have in DHIS2 (tracker, events and aggregate data), I will just briefly point out how organisation units apply to each model. For tracker data, it is the organisation unit the tracked entity instance (TEI) is associated with; for events, it is the one the event is associated with; and for aggregate data, it is the one the data set/data value is associated with.

The capture scope decides whether or not the user has access to add or edit data associated with a given organisation unit. This means users are only able to create or work with TEIs that are associated with an organisation unit within their capture scope. The same applies to events and aggregate data, based on where the events or data sets are associated.

The search scope decides whether a user can see the existence of data or not. I believe this doesn’t apply to aggregate data, where viewing and editing are determined by the same scope. For tracker data especially, it means you can have a bigger search scope than capture scope, to be able to find TEIs that might be associated with a different organisation unit. By default, users with search scope access to a TEI will be able to view any information for that TEI.

Since some information attached to a TEI and a program/enrolment might be sensitive outside the owning organisation unit, we have introduced a concept called “breaking the glass”. This feature allows a program to restrict access to its data on four levels: Open, Audited, Protected and Closed. Open is the default, which means anyone can read the data. Audited means any read through the search scope (implying the TEI is outside the user’s capture scope, but within the search scope) will be recorded. Protected means the user has to actively confirm that they want to access the data and state a reason, and will then be granted time-limited access to read the data. This access is also recorded. The final option, Closed, means you can only read the data if you have the organisation unit of the TEI in your capture scope.
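The four levels can be summarised as a small decision function. Treat this as a sketch of my description above, not the actual implementation: `AccessLevel`, `read_decision` and the `glass_broken` flag (standing in for an unexpired time-limited grant) are names I invented, and I am assuming that a user with the TEI in neither capture nor search scope is always denied.

```python
from __future__ import annotations

from enum import Enum


class AccessLevel(Enum):
    OPEN = "open"
    AUDITED = "audited"
    PROTECTED = "protected"
    CLOSED = "closed"


def read_decision(level: AccessLevel, in_capture_scope: bool,
                  in_search_scope: bool,
                  glass_broken: bool = False) -> tuple[bool, bool]:
    """Return (allowed, recorded) for a read of program data on a TEI.

    glass_broken stands for "the user confirmed, gave a reason and still
    holds an unexpired temporary grant" (hypothetical flag).
    """
    if in_capture_scope:
        return True, False      # owning org unit is in capture scope
    if not in_search_scope:
        return False, False     # assumption: outside both scopes is denied
    if level is AccessLevel.OPEN:
        return True, False
    if level is AccessLevel.AUDITED:
        return True, True       # allowed, but the read is recorded
    if level is AccessLevel.PROTECTED:
        return glass_broken, glass_broken  # only after breaking the glass
    return False, False         # CLOSED: capture scope only


print(read_decision(AccessLevel.AUDITED, False, True))
print(read_decision(AccessLevel.PROTECTED, False, True))
print(read_decision(AccessLevel.PROTECTED, False, True, glass_broken=True))
print(read_decision(AccessLevel.CLOSED, False, True))
```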

A final note about TEI organisation units and “breaking the glass” before we go to the analytics scope: Since a TEI can have enrolments in organisation units other than the one it was first enrolled in, we introduced the concept of “ownership”. An organisation unit will “own” a TEI and program tuple. This means the tuple can move between organisation units without changing the factual data, but it also means that when we check for search scope access, the TEI can belong to multiple organisation units. If any owning organisation unit is within the search scope, the user can see the TEI.
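Ownership can be pictured as a lookup from a (TEI, program) tuple to the owning organisation unit. The sketch below is purely illustrative: the identifiers, the `ownership` dictionary and the helper functions are hypothetical, and for brevity the search scope is given as an already expanded set of organisation units.

```python
from __future__ import annotations

# Hypothetical ownership table: (TEI, program) -> owning organisation unit.
Ownership = dict[tuple[str, str], str]

ownership: Ownership = {
    ("tei-1", "program-x"): "Clinic A",
    ("tei-1", "program-y"): "Clinic B",
}


def owners_of(tei: str, table: Ownership) -> set[str]:
    """All organisation units that own this TEI through some program."""
    return {org_unit for (t, _program), org_unit in table.items() if t == tei}


def visible_in_search(tei: str, table: Ownership, search_units: set[str]) -> bool:
    """The TEI is visible if any owning organisation unit is in the search scope."""
    return bool(owners_of(tei, table) & search_units)


def transfer(table: Ownership, tei: str, program: str, new_org_unit: str) -> None:
    """Move ownership of one (TEI, program) tuple; enrolment data is untouched."""
    table[(tei, program)] = new_org_unit


print(visible_in_search("tei-1", ownership, {"Clinic B"}))  # owned via program-y
transfer(ownership, "tei-1", "program-y", "Clinic C")
print(visible_in_search("tei-1", ownership, {"Clinic B"}))  # no longer owned there
```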

The final organisation unit scope, analytics, determines what analytical data users are allowed to see. A user who might only be able to enter data for a few organisation units might be able to see the reported data for a bigger scope of organisation units. For aggregate data, this is pretty straightforward. For tracker and event data, however, all analytical data processed by the backend is currently based on enrolments and events, so “ownership” does not affect the data seen from this perspective.

Metadata and data read/write
The last layer of access control we enforce is based on the metadata. Metadata itself can be public or private, or shared with specific users or user groups. Each of these rules is accompanied by one of five levels of access: No access, Metadata read, Metadata write, Data read and Data write. No access is self-explanatory, while Metadata read and Metadata write restrict a user to either only being able to read the metadata, or also being able to edit it. Data read and Data write, on the other hand, refer to the data described by the metadata. For example, a data element “Lab result” can have Data read for everyone who needs to see the lab results of a test, while lab workers might have Data write access to be able to edit the value of the data element to match the test results. However, the result of the test might be sensitive, so other users might only have Metadata read or even No access, implicitly removing the data element and associated values from their view.
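The “Lab result” example can be sketched with a small sharing model. Again, this is an illustration and not how DHIS2 stores sharing: the `Access` flags, the `Sharing` class and the group names are made up, and I am assuming that the public, user and user group rules combine additively (any matching rule that grants an access is enough).

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Flag, auto


class Access(Flag):
    NONE = 0
    METADATA_READ = auto()
    METADATA_WRITE = auto()
    DATA_READ = auto()
    DATA_WRITE = auto()


@dataclass
class Sharing:
    """Hypothetical sharing settings for one metadata object."""
    public: Access = Access.NONE
    users: dict[str, Access] = field(default_factory=dict)
    groups: dict[str, Access] = field(default_factory=dict)


def effective_access(sharing: Sharing, username: str,
                     user_groups: set[str]) -> Access:
    """Combine public, per-user and per-group rules (assumed additive)."""
    result = sharing.public | sharing.users.get(username, Access.NONE)
    for group in user_groups:
        result |= sharing.groups.get(group, Access.NONE)
    return result


# "Lab result": clinicians can read values, lab workers can also write them.
lab_result = Sharing(
    public=Access.NONE,
    groups={
        "Clinicians": Access.METADATA_READ | Access.DATA_READ,
        "Lab workers": Access.METADATA_READ | Access.DATA_READ | Access.DATA_WRITE,
    },
)

clinician = effective_access(lab_result, "anna", {"Clinicians"})
lab_worker = effective_access(lab_result, "ben", {"Lab workers"})
outsider = effective_access(lab_result, "carl", set())

print(Access.DATA_WRITE in clinician)   # clinicians cannot edit the value
print(Access.DATA_WRITE in lab_worker)  # lab workers can
print(outsider == Access.NONE)          # everyone else has no access
```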

This is a very shallow explanation of the concepts, and there is a lot more detail if required, but it should give you some core ideas on how we deal with access and privacy.

Here is some key documentation about the concepts I described. Some of the information is located in other sections as well, for example the sections on configuring metadata.

@Lars @morten @Markus @mike feel free to correct me on any wrong assumptions or descriptions.