About Regex-Validation blueprint

I thought I should forward this conversation Hieu and I have been having for the benefit of the group. Perhaps there are others that have suggestions/comments/rants about how we should go about field level validation with regular expressions?

Regards,
Jason

···

---------- Forwarded message ----------
From: Jason Pickering jason.p.pickering@gmail.com

Date: Fri, Mar 26, 2010 at 10:31 AM
Subject: Re: About Regex-Validation blueprint
To: Hieu Dang Duy hieu.hispvietnam@gmail.com

Hi Hieu,
Well, we can always start simple, and go from there, but it is important that we architect everything properly.

So, I suggest that for everything out of the translation table, we use something like the data validation procedures. A user enter a rules, and then runs a check. Eventually, this could be extended to either the UI validation. But, my thinking was similar to that of DHIS 1.4, which allows data validation rules to be defined in terms of SQL, and then run by the user during a validation check.

So, lets try a a full example. I have a situation now, where I have organisationunit short names, with trailing spaces.

These there would need to be a UI screen to define the rule itself.
a) 1 (or something from the hibernate_sequence) (regexid)

b) No trailing spaces (name)
c) \s+$ (regexexpression)
d) This field is not allowed to have trailing spaces (violation_message)
e) Please remove all trailing spaces from this field. This means any spaces after the last character. (violation_resolution)

The UI, would contain a list of objects. The user would select the object (organisationunit) and then the property (name) and the rule (foreign key reference to the first table).

a) 1 (foreign key reference to the regex_objects table /object)
b) organisationunit (object/table)
c) name (property)
d) FALSE (negate)
e) TRUE (case_insensitive)
f) NULL locale

I would guess for the translation table, the method would need to be extended somehow, to deal with the locale, which is not present in other objects/tables.

Now, a third UI screen would allow the user to choose the rule(s) they wish to run, similar to the data validation screen. A list of objects that match the rules that the user selects would then be returned to the user. I guess having the ability to print/save this list would be useful. Perhaps it should be brought up as a popup/seperate window to allow the user to resolve each of the violations in turn.

Personally, I think we should start with everything other than the translations table, but I do not actually see much difference. Basically, if the user were to select the translation table, then the locale_code would need to be set in the regex definition.

Making any more sense now? If I could code Java, I would help out, but I am like an old dog, too old to learn new tricks. :slight_smile:

Best regards,
jason

On Fri, Mar 26, 2010 at 10:16 AM, Hieu Dang Duy hieu.hispvietnam@gmail.com wrote:

Dear Jason,

Actually, your password example it is an interesting point. So, I’ve thought quite simple but maybe not in fact. Especially with the locale field. So, I understand now that why you called this field is the extra field.

The functionality of regex validation which is larger and scale than I thought.

I am confusing now. Would u like to give me any suggestion about what should I have set up which key is it ?

Thank you !

On Fri, Mar 26, 2010 at 4:15 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

And the term f ) … How I can forget this one. I am really careless person. DHIS2 is a multilingual software. Thank you so much for reminding me on this issue.

Yes, I did not consider the translation table. I suppose we would need to specify an additional field for the locale in’t this table. For instance, certain regex may apply for English that do not apply for Vietnamese. A good example would be that of the data format field. Different countries have different ways of specifying the dates, so I guess the second table would

Yes, of course or absolutely. But may I ask you a sensitive question? Why don’t you consider to the translation table. As forgot or you really don’t want to?

OK, here is the issue with the translation table as I see it. Again, this may not be the same way as with Java. When I look at for instance certain fields, lets say Organisationunit.shortname, there is no corresponding translation in the translation table. As far as i can tell, these properties are not translatable, yet. Additionally, we might want to use regex validation on other properties that are not translatable, for instance the user password, for example ^(?=.\d)(?=.[a-z])(?=.*[A-Z]).{4,8}$ to give a password that must be at least 4 characters, no more than 8 characters, and must include at least one upper case letter, one lower case letter, and one numeric digit. Now, different implementers might have different restrictions on the type of the type of password they would accept. Another example might be the use of regular expressions to validate identifiers. For instance ^\d{3}-\d{2}-\d{4}$ can be used to validate a Social Security number used in the US. Other countries have different identifiers, so there is a need to be able to validate different identifier types depending on the country.

So, as for the translation table, I would consider this as a separate object really to apply a regex to, but I would not consider it to be the ONLY table that regex validations should apply to.

So, for the second table, I would see it being persisted like this…

a) regexid- 1 (foreign key reference to the regex_objects) table
b) object/table-translation
c) property/field-value

d) FALSE (In this case, we will not negate the expression as we want all fields that DO have trailing spaces.

e) TRUE (With this regex, it does not matter, but we will default to true anyway)

f) locale (en_GB) I think this may be necessary to deal with your next point about translations

To validate a password, the object would look something like this…

a) regexid- 1 (foreign key reference to the regex_objects) table

b) object/table-users

c) property/field-password

d) FALSE (In this case, we will not negate the expression as we want all fields that DO have trailing spaces.

e) FALSE (With this regex, it does, but we will default to true anyway)

f) locale NULL

Now, the case of the password brings up an interesting point, as it is persisted in the DB as a hash. Not really sure how to deal with this one…

Anyway, maybe this is clearer?

Regards,

Jason

I did not really consider the translations so much. I guess there is a need for an additional field. By default, it would be the default locale. The user would then need to construct different regex expressions for different locales, or apply the same regex to several locales. Maybe this extra property would be enough for this? If it is NULL, then the regex would use the base object (organisationunit.name), but if there is a locale stored in the field, then it would use the translation.organisationunit.name property for that particular locale. Does this make sense?

After your suggestion on adding new field is locale. So, I thought that the translation table’s model data which are very very useful for the regex table’s case. Beside, the locale filed I would like to suggest another field is that object’s i likes the translation table.

If so, we will have a composite key as same as the key one in Translation table.
Please have a look at the example is below:

*** Example ***

With regexId : 01

01- (001 - DataElement - name - en_GB) - false - true**===> SAFED**

01- (002 - DataElement - shortname - en_GB) - F - T**===> SAFED**

01- (003 - DataElement - name - vi_VN) - F - T**===> SAFED**

02 - (001 -DataElement - name - en_GB) - false - true**===> UN-SAFED ===> UN-INSERTED**

In this example, with the composite key. We do not allow the action of inserting/updating an other regex for the same object.

How do you think dear Jason?

Looking forward to working on this with you.

Thanks again !

Best regards,
Jason

Jason P. Pickering
email: jason.p.pickering@gmail.com

tel:+260968395190


Hieu.HISPVietnam
Good Health !

Jason P. Pickering
email: jason.p.pickering@gmail.com

tel:+260968395190


Hieu.HISPVietnam
Good Health !

Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

I have been pondering over this over the weekend, and have a few more thoughts here.

The more I think about it, the more I think this is similar to something I have been thinking about for some time, which is the ability to execute passthrough SQL queries on the DHIS2 database. I know that H2, Postgres and MySQL all have support for regular expressions, with slight variations. So, what if we broke this blueprint up into several pieces.

  1. A GUI element to allow users to enter field-level validation with regular expressions, as I described in my previous mail. These regexes could conceivably be reused during data import from external systems (SDMX-HD). Seems pretty simple.

  2. A query executor, that for right now, would execute a select query based on the stuff stored in the regex table, and execute a query against the database. As an example, these two objects would be stored in the database

  3. A regular expression
    a) 1 (or something from the hibernate_sequence) (regexid)

b) No trailing spaces (name)
c) \s+$ (regexexpression)
d) This field is not allowed to have trailing spaces (violation_message)
e) Please remove all trailing spaces from this field. This means any spaces after the last character. (violation_resolution)

  1. What the regular expression should operate on
    a) 1 (foreign key reference to the regex_objects table /object)
    b) organisationunit (object/table)
    c) name (property)
    d) FALSE (negate)
    e) TRUE (case_insensitive)
    f) NULL locale

Java can work some magic here, and convert this into an SQL query for the particular DB system we are using…

SELECT * FROM organisationunit where name ~(‘\s+$’) for postgres

In MySQL this would become…

SELECT * FROM organisationunit WHERE name REGEXP ‘\s$’

I think it is possible in H2, but no idea what the grammar is.

Some sort of list (name, shortname, some other fields) would be returned to the user along with instructions (1.e) about how to resolve it.

There could be many more uses for a passthrough SQL query executor, thus it might be useful for constructing it in a way to make in generic to execute arbitrary SELECT statements. I will not address this in this thread, but perhaps a separate blueprint.

Thoughts?

Regards,
Jason

···

On Fri, Mar 26, 2010 at 3:24 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

I thought I should forward this conversation Hieu and I have been having for the benefit of the group. Perhaps there are others that have suggestions/comments/rants about how we should go about field level validation with regular expressions?

Regards,
Jason

---------- Forwarded message ----------
From: Jason Pickering jason.p.pickering@gmail.com

Date: Fri, Mar 26, 2010 at 10:31 AM
Subject: Re: About Regex-Validation blueprint
To: Hieu Dang Duy hieu.hispvietnam@gmail.com

Hi Hieu,

Well, we can always start simple, and go from there, but it is important that we architect everything properly.

So, I suggest that for everything out of the translation table, we use something like the data validation procedures. A user enter a rules, and then runs a check. Eventually, this could be extended to either the UI validation. But, my thinking was similar to that of DHIS 1.4, which allows data validation rules to be defined in terms of SQL, and then run by the user during a validation check.

So, lets try a a full example. I have a situation now, where I have organisationunit short names, with trailing spaces.

These there would need to be a UI screen to define the rule itself.
a) 1 (or something from the hibernate_sequence) (regexid)

b) No trailing spaces (name)
c) \s+$ (regexexpression)
d) This field is not allowed to have trailing spaces (violation_message)
e) Please remove all trailing spaces from this field. This means any spaces after the last character. (violation_resolution)

The UI, would contain a list of objects. The user would select the object (organisationunit) and then the property (name) and the rule (foreign key reference to the first table).

a) 1 (foreign key reference to the regex_objects table /object)
b) organisationunit (object/table)
c) name (property)
d) FALSE (negate)
e) TRUE (case_insensitive)
f) NULL locale

I would guess for the translation table, the method would need to be extended somehow, to deal with the locale, which is not present in other objects/tables.

Now, a third UI screen would allow the user to choose the rule(s) they wish to run, similar to the data validation screen. A list of objects that match the rules that the user selects would then be returned to the user. I guess having the ability to print/save this list would be useful. Perhaps it should be brought up as a popup/seperate window to allow the user to resolve each of the violations in turn.

Personally, I think we should start with everything other than the translations table, but I do not actually see much difference. Basically, if the user were to select the translation table, then the locale_code would need to be set in the regex definition.

Making any more sense now? If I could code Java, I would help out, but I am like an old dog, too old to learn new tricks. :slight_smile:

Best regards,
jason

On Fri, Mar 26, 2010 at 10:16 AM, Hieu Dang Duy hieu.hispvietnam@gmail.com wrote:

Dear Jason,

Actually, your password example it is an interesting point. So, I’ve thought quite simple but maybe not in fact. Especially with the locale field. So, I understand now that why you called this field is the extra field.

The functionality of regex validation which is larger and scale than I thought.

I am confusing now. Would u like to give me any suggestion about what should I have set up which key is it ?

Thank you !

On Fri, Mar 26, 2010 at 4:15 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

And the term f ) … How I can forget this one. I am really careless person. DHIS2 is a multilingual software. Thank you so much for reminding me on this issue.

Yes, I did not consider the translation table. I suppose we would need to specify an additional field for the locale in’t this table. For instance, certain regex may apply for English that do not apply for Vietnamese. A good example would be that of the data format field. Different countries have different ways of specifying the dates, so I guess the second table would

Yes, of course or absolutely. But may I ask you a sensitive question? Why don’t you consider to the translation table. As forgot or you really don’t want to?

OK, here is the issue with the translation table as I see it. Again, this may not be the same way as with Java. When I look at for instance certain fields, lets say Organisationunit.shortname, there is no corresponding translation in the translation table. As far as i can tell, these properties are not translatable, yet. Additionally, we might want to use regex validation on other properties that are not translatable, for instance the user password, for example ^(?=.\d)(?=.[a-z])(?=.*[A-Z]).{4,8}$ to give a password that must be at least 4 characters, no more than 8 characters, and must include at least one upper case letter, one lower case letter, and one numeric digit. Now, different implementers might have different restrictions on the type of the type of password they would accept. Another example might be the use of regular expressions to validate identifiers. For instance ^\d{3}-\d{2}-\d{4}$ can be used to validate a Social Security number used in the US. Other countries have different identifiers, so there is a need to be able to validate different identifier types depending on the country.

So, as for the translation table, I would consider this as a separate object really to apply a regex to, but I would not consider it to be the ONLY table that regex validations should apply to.

So, for the second table, I would see it being persisted like this…

a) regexid- 1 (foreign key reference to the regex_objects) table
b) object/table-translation
c) property/field-value

d) FALSE (In this case, we will not negate the expression as we want all fields that DO have trailing spaces.

e) TRUE (With this regex, it does not matter, but we will default to true anyway)

f) locale (en_GB) I think this may be necessary to deal with your next point about translations

To validate a password, the object would look something like this…

a) regexid- 1 (foreign key reference to the regex_objects) table

b) object/table-users

c) property/field-password

d) FALSE (In this case, we will not negate the expression as we want all fields that DO have trailing spaces.

e) FALSE (With this regex, it does, but we will default to true anyway)

f) locale NULL

Now, the case of the password brings up an interesting point, as it is persisted in the DB as a hash. Not really sure how to deal with this one…

Anyway, maybe this is clearer?

Regards,

Jason

I did not really consider the translations so much. I guess there is a need for an additional field. By default, it would be the default locale. The user would then need to construct different regex expressions for different locales, or apply the same regex to several locales. Maybe this extra property would be enough for this? If it is NULL, then the regex would use the base object (organisationunit.name), but if there is a locale stored in the field, then it would use the translation.organisationunit.name property for that particular locale. Does this make sense?

After your suggestion on adding new field is locale. So, I thought that the translation table’s model data which are very very useful for the regex table’s case. Beside, the locale filed I would like to suggest another field is that object’s i likes the translation table.

If so, we will have a composite key as same as the key one in Translation table.
Please have a look at the example is below:

*** Example ***

With regexId : 01

01- (001 - DataElement - name - en_GB) - false - true**===> SAFED**

01- (002 - DataElement - shortname - en_GB) - F - T**===> SAFED**

01- (003 - DataElement - name - vi_VN) - F - T**===> SAFED**

02 - (001 -DataElement - name - en_GB) - false - true**===> UN-SAFED ===> UN-INSERTED**

In this example, with the composite key. We do not allow the action of inserting/updating an other regex for the same object.

How do you think dear Jason?

Looking forward to working on this with you.

Thanks again !

Best regards,
Jason

Jason P. Pickering
email: jason.p.pickering@gmail.com

tel:+260968395190


Hieu.HISPVietnam
Good Health !

Jason P. Pickering
email: jason.p.pickering@gmail.com

tel:+260968395190


Hieu.HISPVietnam
Good Health !

Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

I have been pondering over this over the weekend, and have a few more thoughts here.

The more I think about it, the more I think this is similar to something I have been thinking about for some time, which is the ability to execute passthrough SQL queries on the DHIS2 database. I know that H2, Postgres and MySQL all have support for regular expressions, with slight variations. So, what if we broke this blueprint up into several pieces.

Yes, me too. But too bad I could not think out any thing. Anyway, I thought that let’s do this in a very simple step first. Maybe, a blueprint regex validation for testing first which the same of validation rule.

  1. A GUI element to allow users to enter field-level validation with regular expressions, as I described in my previous mail. These regexes could conceivably be reused during data import from external systems (SDMX-HD). Seems pretty simple.

Sorry Jason,

I am still not un-clearly about this level field. So, could you explain me more about this one. Is it a new field in the Regex table or the referencing table (the second one)? What is it using for ? I would like to see some of examples with this field, can you show up them, please ?

  1. A query executor, that for right now, would execute a select query based on the stuff stored in the regex table, and execute a query against the database. As an example, these two objects would be stored in the database

  2. A regular expression

a) 1 (or something from the hibernate_sequence) (regexid)

b) No trailing spaces (name)
c) \s+$ (regexexpression)
d) This field is not allowed to have trailing spaces (violation_message)
e) Please remove all trailing spaces from this field. This means any spaces after the last character. (violation_resolution)

  1. What the regular expression should operate on

a) 1 (foreign key reference to the regex_objects table /object)
b) organisationunit (object/table)
c) name (property)
d) FALSE (negate)
e) TRUE (case_insensitive)
f) NULL locale

Java can work some magic here, and convert this into an SQL query for the particular DB system we are using…

SELECT * FROM organisationunit where name ~(‘\s+$’) for postgres

In MySQL this would become…

SELECT * FROM organisationunit WHERE name REGEXP ‘\s$’

I think it is possible in H2, but no idea what the grammar is.

Yes, I think we can do this later. Maybe should not too worry about this.

Some sort of list (name, shortname, some other fields) would be returned to the user along with instructions (1.e) about how to resolve it.

There could be many more uses for a passthrough SQL query executor, thus it might be useful for constructing it in a way to make in generic to execute arbitrary SELECT statements. I will not address this in this thread, but perhaps a separate blueprint.

As I said which thing is above, we just do it as testing blueprint first to jump in the next one.

···

On Mon, Mar 29, 2010 at 3:08 AM, Jason Pickering jason.p.pickering@gmail.com wrote:

Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190


Hieu.HISPVietnam
Good Health !

OK, let me try and explain. Remember, I think in terms of databases, you think in terms of Java, so I think we are getting tripped up on the vocabulary. :slight_smile:

Basically, what I mean by “field level validation” is the ability to check an individual field in a database table/object. So, in the example that I have been using, organisationunit is the table, and “name” is the field to be checked with. I guess in Java terms this would correspond to an object and a property.

So, again, I have been envisioning this functioning as a field/property level validation function, similar to the data integrity checks. If you look at DHIS 1.4, there are data integrity checks stored as SQL statements in the database, which allows flexible definition of data integrity rules. What I am envisioning is similar to this functionality, but allowing users the ability to define a regular expression and then a field/property to match it against.

Hope this makes this more clear?

Regards,
Jason

···

On Mon, Mar 29, 2010 at 6:37 AM, Hieu Dang Duy hieu.hispvietnam@gmail.com wrote:

On Mon, Mar 29, 2010 at 3:08 AM, Jason Pickering jason.p.pickering@gmail.com wrote:

I have been pondering over this over the weekend, and have a few more thoughts here.

The more I think about it, the more I think this is similar to something I have been thinking about for some time, which is the ability to execute passthrough SQL queries on the DHIS2 database. I know that H2, Postgres and MySQL all have support for regular expressions, with slight variations. So, what if we broke this blueprint up into several pieces.

Yes, me too. But too bad I could not think out any thing. Anyway, I thought that let’s do this in a very simple step first. Maybe, a blueprint regex validation for testing first which the same of validation rule.

  1. A GUI element to allow users to enter field-level validation with regular expressions, as I described in my previous mail. These regexes could conceivably be reused during data import from external systems (SDMX-HD). Seems pretty simple.

Sorry Jason,

I am still not un-clearly about this level field. So, could you explain me more about this one. Is it a new field in the Regex table or the referencing table (the second one)? What is it using for ? I would like to see some of examples with this field, can you show up them, please ?

  1. A query executor, that for right now, would execute a select query based on the stuff stored in the regex table, and execute a query against the database. As an example, these two objects would be stored in the database

  2. A regular expression

a) 1 (or something from the hibernate_sequence) (regexid)

b) No trailing spaces (name)
c) \s+$ (regexexpression)
d) This field is not allowed to have trailing spaces (violation_message)
e) Please remove all trailing spaces from this field. This means any spaces after the last character. (violation_resolution)

  1. What the regular expression should operate on

a) 1 (foreign key reference to the regex_objects table /object)
b) organisationunit (object/table)
c) name (property)
d) FALSE (negate)
e) TRUE (case_insensitive)
f) NULL locale

Java can work some magic here, and convert this into an SQL query for the particular DB system we are using…

SELECT * FROM organisationunit where name ~(‘\s+$’) for postgres

In MySQL this would become…

SELECT * FROM organisationunit WHERE name REGEXP ‘\s$’

I think it is possible in H2, but no idea what the grammar is.

Yes, I think we can do this later. Maybe should not too worry about this.

Some sort of list (name, shortname, some other fields) would be returned to the user along with instructions (1.e) about how to resolve it.

There could be many more uses for a passthrough SQL query executor, thus it might be useful for constructing it in a way to make in generic to execute arbitrary SELECT statements. I will not address this in this thread, but perhaps a separate blueprint.

As I said which thing is above, we just do it as testing blueprint first to jump in the next one.

Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190


Hieu.HISPVietnam
Good Health !

Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

Yes, I understood now. Really thanks for your enthusiasm in explaining.
However, I have never seen or used DHIS1.4. But is that “Data integrity checks” functionality in DHIS1.4 which is the same of “Data integrity” in DHIS2.0 ?

···

On Mon, Mar 29, 2010 at 1:45 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

OK, let me try and explain. Remember, I think in terms of databases, you think in terms of Java, so I think we are getting tripped up on the vocabulary. :slight_smile:

Basically, what I mean by “field level validation” is the ability to check an individual field in a database table/object. So, in the example that I have been using, organisationunit is the table, and “name” is the field to be checked with. I guess in Java terms this would correspond to an object and a property.

So, again, I have been envisioning this functioning as a field/property level validation function, similar to the data integrity checks. If you look at DHIS 1.4, there are data integrity checks stored as SQL statements in the database, which allows flexible definition of data integrity rules. What I am envisioning is similar to this functionality, but allowing users the ability to define a regular expression and then a field/property to match it against.

Hope this makes this more clear?

Regards,
Jason


Jason P. Pickering
email: jason.p.pickering@gmail.com

tel:+260968395190


Hieu.HISPVietnam
Good Health !

Hi again,

If you want, you can download a copy of DHIS 1.4 from hispkerala.org and take a look at the DataIntegrityCheck table. For instance, the rule “OrgUnit Group Set Exclusive Violation”

is defined as

SELECT OrgUnit.OrgUnitName AS OrgUnit, OrgUnitGroupSet.OrgUnitGroupSetName AS GroupSet, Count(OrgUnitGroupSet.OrgUnitGroupSetID) AS [Group Set Allocations]
FROM OrgUnit INNER JOIN (OrgUnitGroupSet INNER JOIN (OrgUnitGroupSetMember INNER JOIN OrgUnitGroupMember ON OrgUnitGroupSetMember.OrgUnitGroupID = OrgUnitGroupMember.OrgUnitGroupID) ON OrgUnitGroupSet.OrgUnitGroupSetID = OrgUnitGroupSetMember.OrgUnitGroupSetID) ON OrgUnit.OrgUnitID = OrgUnitGroupMember.OrgUnitID

WHERE OrgUnitGroupSet.OrgUnitGroupSetExclusive=1 AND OrgUnitGroupMember.Active = 1
GROUP BY OrgUnit.OrgUnitName, OrgUnitGroupSet.OrgUnitGroupSetName
HAVING Count(OrgUnitGroupSet.OrgUnitGroupSetID)>1
ORDER BY OrgUnit.OrgUnitName;

The data integrity check functionality in 1.4 is essentially exactly the same as DHIS2, execept it is possible to extend the data integrity checks by adding new SQL statements in the 1.4 database. I would really like this see this feature in 2.0, as right now, all of the rules have been “hard-coded” in Java, and are not extendible, except by modification of the source code (Lars correct me if I am wrong). I expect that we could do it the 1.4 way, by using ANSI compatible queries that would work across Postgres, MySQL and H2. Anyway, that is another blueprint I think.

Regards,
Jason

···

On Mon, Mar 29, 2010 at 9:33 AM, Hieu Dang Duy hieu.hispvietnam@gmail.com wrote:

Yes, I understood now. Really thanks for your enthusiasm in explaining.
However, I have never seen or used DHIS1.4. But is that “Data integrity checks” functionality in DHIS1.4 which is the same of “Data integrity” in DHIS2.0 ?

On Mon, Mar 29, 2010 at 1:45 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

OK, let me try and explain. Remember, I think in terms of databases, you think in terms of Java, so I think we are getting tripped up on the vocabulary. :slight_smile:

Basically, what I mean by “field level validation” is the ability to check an individual field in a database table/object. So, in the example that I have been using, organisationunit is the table, and “name” is the field to be checked with. I guess in Java terms this would correspond to an object and a property.

So, again, I have been envisioning this functioning as a field/property level validation function, similar to the data integrity checks. If you look at DHIS 1.4, there are data integrity checks stored as SQL statements in the database, which allows flexible definition of data integrity rules. What I am envisioning is similar to this functionality, but allowing users the ability to define a regular expression and then a field/property to match it against.

Hope this makes this more clear?

Regards,
Jason


Jason P. Pickering
email: jason.p.pickering@gmail.com

tel:+260968395190


Hieu.HISPVietnam
Good Health !

Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

OK, here is the re-construction of the statement query instance which is modified for who wanna execute it following DHIS2’s contruction in PostgresQL.

***SELECT OrgUnit.Name AS OrgUnit, OrgUnitGroupSet.Name AS GroupSet, Count(OrgUnitGroupSet.OrgUnitGroupSetID) AS “Group Set Allocations”

FROM OrganisationUnit AS
OrgUnit INNER JOIN (
OrgUnitGroupSet INNER JOIN (
OrgUnitGroupSetMembers INNER JOIN
OrgUnitGroupMembers
ON OrgUnitGroupSetMembers.OrgUnitGroupID = OrgUnitGroupMembers.OrgUnitGroupID)

       ON OrgUnitGroupSet.OrgUnitGroupSetID = OrgUnitGroupSetMembers.OrgUnitGroupSetID)
       ON OrgUnit.OrganisationUnitID = OrgUnitGroupMembers.OrganisationUnitID

WHERE OrgUnitGroupSet.Exclusive = true

GROUP BY OrgUnit.Name, OrgUnitGroupSet.Name
HAVING Count(OrgUnitGroupSet.OrgUnitGroupSetID) > 1
ORDER BY OrgUnit.Name;*

···

**
On Mon, Mar 29, 2010 at 2:44 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi again,

If you want, you can download a copy of DHIS 1.4 from hispkerala.org and take a look at the DataIntegrityCheck table. For instance, the rule “OrgUnit Group Set Exclusive Violation”

is defined as

SELECT OrgUnit.OrgUnitName AS OrgUnit, OrgUnitGroupSet.OrgUnitGroupSetName AS GroupSet, Count(OrgUnitGroupSet.OrgUnitGroupSetID) AS [Group Set Allocations]
FROM OrgUnit INNER JOIN (OrgUnitGroupSet INNER JOIN (OrgUnitGroupSetMember INNER JOIN OrgUnitGroupMember ON OrgUnitGroupSetMember.OrgUnitGroupID = OrgUnitGroupMember.OrgUnitGroupID) ON OrgUnitGroupSet.OrgUnitGroupSetID = OrgUnitGroupSetMember.OrgUnitGroupSetID) ON OrgUnit.OrgUnitID = OrgUnitGroupMember.OrgUnitID

WHERE OrgUnitGroupSet.OrgUnitGroupSetExclusive=1 AND OrgUnitGroupMember.Active = 1
GROUP BY OrgUnit.OrgUnitName, OrgUnitGroupSet.OrgUnitGroupSetName
HAVING Count(OrgUnitGroupSet.OrgUnitGroupSetID)>1
ORDER BY OrgUnit.OrgUnitName;

The data integrity check functionality in 1.4 is essentially exactly the same as DHIS2, execept it is possible to extend the data integrity checks by adding new SQL statements in the 1.4 database. I would really like this see this feature in 2.0, as right now, all of the rules have been “hard-coded” in Java, and are not extendible, except by modification of the source code (Lars correct me if I am wrong). I expect that we could do it the 1.4 way, by using ANSI compatible queries that would work across Postgres, MySQL and H2. Anyway, that is another blueprint I think.

Regards,
Jason


Jason P. Pickering
email: jason.p.pickering@gmail.com

tel:+260968395190


Hieu.HISPVietnam
Good Health !

Yes, exactly. :slight_smile:

But not to get distracted…

What I wanted to highlight was that the DHIS 1.4 data integrity checks functionality allows predefined queries to be stored in a table, thus making the data integrity checks extendable.

What I am proposing with the regular expression validation would be the dynamic contsruction and execution of queries, based on the contents of those two objects/tables we have been discussing.

Java would be used to construct the query dynamically based on the the database system being used. Postgres, MySQL and H2 all have slightly different syntax when it comes to regular expressions, which are not ANSI SQL compatible anyway. Hibernate, as far as I can tell, does not support regular expressions directly, so I think a pass through query would need to be used in this case.

···

On Mon, Mar 29, 2010 at 10:20 AM, Hieu Dang Duy hieu.hispvietnam@gmail.com wrote:

OrgUnit INNER JOIN (
OrgUnitGroupSet INNER JOIN (

HAVING Count(OrgUnitGroupSet.OrgUnitGroupSetID) > 1

OK, here is the re-construction of the statement query instance which is modified for who wanna execute it following DHIS2’s contruction in PostgresQL.

***SELECT OrgUnit.Name AS OrgUnit, OrgUnitGroupSet.Name AS GroupSet, Count(OrgUnitGroupSet.OrgUnitGroupSetID) AS “Group Set Allocations”

FROM OrganisationUnit AS
OrgUnitGroupSetMembers INNER JOIN
OrgUnitGroupMembers
ON OrgUnitGroupSetMembers.OrgUnitGroupID = OrgUnitGroupMembers.OrgUnitGroupID)

       ON OrgUnitGroupSet.OrgUnitGroupSetID = OrgUnitGroupSetMembers.OrgUnitGroupSetID)
       ON OrgUnit.OrganisationUnitID = OrgUnitGroupMembers.OrganisationUnitID

WHERE OrgUnitGroupSet.Exclusive = true

GROUP BY OrgUnit.Name, OrgUnitGroupSet.NameORDER BY OrgUnit.Name;*
**
On Mon, Mar 29, 2010 at 2:44 PM, Jason Pickering jason.p.pickering@gmail.com wrote:

Hi again,

If you want, you can download a copy of DHIS 1.4 from hispkerala.org and take a look at the DataIntegrityCheck table. For instance, the rule “OrgUnit Group Set Exclusive Violation”

is defined as

SELECT OrgUnit.OrgUnitName AS OrgUnit, OrgUnitGroupSet.OrgUnitGroupSetName AS GroupSet, Count(OrgUnitGroupSet.OrgUnitGroupSetID) AS [Group Set Allocations]
FROM OrgUnit INNER JOIN (OrgUnitGroupSet INNER JOIN (OrgUnitGroupSetMember INNER JOIN OrgUnitGroupMember ON OrgUnitGroupSetMember.OrgUnitGroupID = OrgUnitGroupMember.OrgUnitGroupID) ON OrgUnitGroupSet.OrgUnitGroupSetID = OrgUnitGroupSetMember.OrgUnitGroupSetID) ON OrgUnit.OrgUnitID = OrgUnitGroupMember.OrgUnitID

WHERE OrgUnitGroupSet.OrgUnitGroupSetExclusive=1 AND OrgUnitGroupMember.Active = 1
GROUP BY OrgUnit.OrgUnitName, OrgUnitGroupSet.OrgUnitGroupSetName
HAVING Count(OrgUnitGroupSet.OrgUnitGroupSetID)>1
ORDER BY OrgUnit.OrgUnitName;

The data integrity check functionality in 1.4 is essentially exactly the same as DHIS2, execept it is possible to extend the data integrity checks by adding new SQL statements in the 1.4 database. I would really like this see this feature in 2.0, as right now, all of the rules have been “hard-coded” in Java, and are not extendible, except by modification of the source code (Lars correct me if I am wrong). I expect that we could do it the 1.4 way, by using ANSI compatible queries that would work across Postgres, MySQL and H2. Anyway, that is another blueprint I think.

Regards,
Jason


Jason P. Pickering
email: jason.p.pickering@gmail.com

tel:+260968395190


Hieu.HISPVietnam
Good Health !

Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190