DHIS2 with R

Hi everyone. I have had a recent question from a user about how DHIS2
can be used with R. I am including a trivial example here about how to
use R as a client to access data in DHIS2 and produce a graph.

Just get a copy of R and install the DBI and RPostgreSQL packages with

install.packages(c("DBI", "RPostgreSQL"))

After that, just connect to the DB, retrieve your data (in this case
from a report table) and produce a graph.

library(DBI)
library(RPostgreSQL)

# Connect to the DHIS2 database
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname="dhis2_zm_prod2", user="postgres", password="postgres")

# Fetch all rows for one organisation unit from the report table
data <- dbGetQuery(con, "SELECT * FROM _report_malaria_indicators_district WHERE organisationunitid = 3904")

# Bar chart of confirmed malaria incidence by period
barplot(data$malaria_confirm_incidence, names.arg=as.character(data$periodname), main=as.character(data$organisationunitname[1]), las=2)

# Copy the plot to a PNG file, giving an explicit width in pixels
dev.print(png, file="/home/jason/test.png", width=800)

dbDisconnect(con)
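
A small variation on the above, as a sketch only: the organisation unit id is passed in rather than hard-coded. The helper names report_query and fetch_report are made up for illustration and are not part of DHIS2 or DBI; the table and column names are the ones from the example, and fetch_report assumes an open DBI connection such as con.

```r
# Build the report-table query for one organisation unit.
# report_query() is a hypothetical helper, not part of DHIS2 or DBI.
report_query <- function(orgunit_id) {
  id <- suppressWarnings(as.integer(orgunit_id))
  # Refuse anything that cannot be converted to a single integer, so
  # that no arbitrary text can end up inside the SQL string.
  if (length(id) != 1 || is.na(id)) {
    stop("orgunit_id must be a single integer")
  }
  sprintf(
    "SELECT * FROM _report_malaria_indicators_district WHERE organisationunitid = %d",
    id
  )
}

# Run the query on an open DBI connection and return a data frame.
fetch_report <- function(con, orgunit_id) {
  DBI::dbGetQuery(con, report_query(orgunit_id))
}
```

With this, data <- fetch_report(con, 3904) would replace the hard-coded query, and the barplot call stays the same.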

Regards,
Jason

---
Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+260968395190

We've talked before about integrating a scripting engine (such as R)
into DHIS: rscript - Java to R scripting interface - RForge.net

But my guess is that most R users are going to be of a level of
sophistication that they would be most comfortable doing the kind of
thing you describe - connecting directly to the DB with an R client and
doing their stuff.

OTOH if there were sufficiently useful "canned" dhis R scripts which
could take some number crunching load off the jvm and produce canned
useful analysis then that would be different.

Sadly I don't know enough about R to know. But I sense it ...

Regards
Bob

_______________________________________________
Mailing list: DHIS 2 developers in Launchpad
Post to : dhis2-devs@lists.launchpad.net
Unsubscribe : DHIS 2 developers in Launchpad
More help : ListHelp - Launchpad Help

Hi Bob,

Yes, I suspect that most R users would probably want to do things
their own way. It has a rather steep learning curve. :)

As for canned R scripts, the best way would probably be with PL/R, a
procedural language for PostgreSQL which utilizes R.

http://www.joeconway.com/plr/doc/index.html

I have done some very basic testing and it seems to work just fine on
the server side.
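
To give an idea of what such a canned script could look like, here is a minimal sketch. The function name incidence_summary and the choice of statistics are purely illustrative. The analysis logic is written as a plain R function so it can be tried in an ordinary R session first; the PL/R wrapper shown in the comment is an assumption based on the pattern in the PL/R documentation and should be checked against the manual linked above before use.

```r
# Plain R function holding the analysis logic. Written this way it can be
# tested from an ordinary R session before it is moved into the database.
incidence_summary <- function(x) {
  x <- x[!is.na(x)]            # ignore periods with no value
  c(n = length(x), mean = mean(x), max = max(x))
}

# Assumed PL/R wrapper (illustrative only, not tested here). Unnamed
# arguments are visible inside the body as arg1, arg2, ...
#
#   CREATE OR REPLACE FUNCTION r_mean(float8[]) RETURNS float8 AS '
#     mean(arg1, na.rm = TRUE)
#   ' LANGUAGE 'plr';

incidence_summary(c(12.5, NA, 20.0, 15.5))
```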

I think they are two separate problems really, but I totally agree, C
is likely going to be faster than Java for big operations. However, I
do think (as all of you know) that certain functions (like aggregation
and heavy cross tab operations) would be much better executed on the
database server as native stored procedures (with the wrapper facade
type of approach).
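
For reference, this is roughly what a cross tab operation looks like when done on the client side in R. The data frame below is made up; in practice the rows would come from the database, and for large tables this is the step that the paragraph above suggests pushing to the database server.

```r
# Made-up data values: one row per organisation unit and period.
values <- data.frame(
  orgunit = c("A", "A", "B", "B", "B"),
  period  = c("2010-01", "2010-02", "2010-01", "2010-01", "2010-02"),
  value   = c(5, 7, 3, 4, 10)
)

# Cross-tabulate: organisation units as rows, periods as columns,
# summing the values that fall into each cell.
crosstab <- xtabs(value ~ orgunit + period, data = values)
print(crosstab)
```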

Regards,
Jason

In fact Ola and I earlier had discussions with someone who had
integrated R scripts in his own Access-based system, but would have
been delighted to switch to DHIS2 as the platform if we could
accommodate the scripts.

And it would be a way to quite quickly enhance the analytical
capabilities, if DHIS2 came with a set of good scripts - perhaps after
being refined and generalized through the more ad hoc process Jason
describes. And one could imagine that most of the "plumbing tasks"
would already be taken care of, leaving a simpler and more efficient
interface for a wider audience.
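
As a rough sketch of what one such canned script might look like once the plumbing is hidden: the function name plot_indicator, its arguments and the demo numbers are all hypothetical; only the column names periodname and organisationunitname come from the report table example earlier in the thread. It writes the chart directly to a PNG file, so it does not need an open screen device.

```r
# Hypothetical "canned" script: plot one indicator by period and save it
# as a PNG. The column names periodname and organisationunitname are the
# ones used in the report table example earlier in the thread.
plot_indicator <- function(data, indicator, file, width = 800, height = 600) {
  needed <- c(indicator, "periodname", "organisationunitname")
  missing_cols <- setdiff(needed, names(data))
  if (length(missing_cols) > 0) {
    stop("missing columns: ", paste(missing_cols, collapse = ", "))
  }
  png(file, width = width, height = height)
  on.exit(dev.off())           # always close the device, even on error
  barplot(data[[indicator]],
          names.arg = as.character(data$periodname),
          main = as.character(data$organisationunitname[1]),
          las = 2)
  invisible(file)
}

# Example with made-up numbers instead of a database query.
demo <- data.frame(
  periodname = c("Jan 2010", "Feb 2010", "Mar 2010"),
  organisationunitname = "Example District",
  malaria_confirm_incidence = c(12.1, 9.8, 14.3)
)
out <- plot_indicator(demo, "malaria_confirm_incidence", tempfile(fileext = ".png"))
```

The user of such a script would only need to supply a data frame and an indicator name; the connection and query handling could sit in a separate helper.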

Knut

--
Cheers,
Knut Staring

Swings and roundabouts to a certain extent. The main thing is that
the R scripts are evaluated using the R C library. If they were
invoked from within Java/DHIS then I guess data access would be slower
than from PL/R (we'd need to have a way to get the data to the R
interpreter), but number crunching would be similar and would also
work with MySQL and friends. Not sure which of these are bigger
problems in typical/possible scenarios.

OpenXdata is looking at R integration over Hibernate, aiming to have a
web-based analytical tool similar to RCmdr/SPSS:

"RCmdr is an R package that provides a user interface which is quite
similar to SPSS, perhaps the most popular package used for statistical
analysis by study designers and public health researchers. RCmdr
provides functionality that broadly mimics that of SPSS."

http://code.zegeba.org/openxdata/raw-attachment/wiki/Statistical_package/Analytics%20pres.pdf
http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/

Hopefully, some of this work can be leveraged for DHIS2 later.

Knut

Yeah, this I guess comes back time and time again, with my somewhat
uncomfortable relationship with Hibernate and Java. Clearly, we need
to think about how to balance making certain procedures cross-platform
(in the sense of working across Postgres, MySQL and other DBs) against
the need to offer advanced analysis capabilities with acceptable
performance.

There could be multiple ways of doing it, but in the absence of having
R integrated into DHIS2, I think the most likely short-term use case
would be just some documentation on how to use the R client with the
DHIS2 database. Perhaps those users who use R with DHIS2 over time
could contribute their procedures, which could then be generalized
with PL/R.

Of course the difference with Postgres is that R procedures can be
embedded as a new language inside the DB. I am not really sure this
is possible with MySQL. This of course reduces the overhead of getting
the data out of Postgres, through Java, and into the R interpreter,
but I am not really sure what the impact of this might be without
testing it.

My concern would be DB performance: there's no telling what kind of locks R or any other product using ODBC/JDBC is going to use. I'm already worried about simultaneous transactional and reporting use. Have there been any large-volume performance tests? Has any thought been given to splitting reporting and data entry between different DB servers? I know everyone has been focused on getting the distributed DB aspects right, but assuming universal availability of internet, how would DHIS2 perform on a single (possibly clustered) national DB server?

-----Original Message-----
From: dhis2-users-bounces+rdf4=cdc.gov@lists.launchpad.net [mailto:dhis2-users-bounces+rdf4=cdc.gov@lists.launchpad.net] On Behalf Of Jason Pickering
Sent: Thursday, May 27, 2010 7:03 AM
To: Bob Jolliffe
Cc: dhis2-users@lists.launchpad.net; dhis2-devs
Subject: Re: [Dhis2-users] [Dhis2-devs] DHIS2 with R

_______________________________________________
Mailing list: DHIS 2 Users in Launchpad
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : DHIS 2 Users in Launchpad
More help : ListHelp - Launchpad Help

Hi Roger,

Valid concerns, but I would assume that in the typical use case the R
user would only be looking at 1) a view that has been prepared for
them or 2) selected tables to which they have been given read-only
access. I would not expect that locks would be a problem in this case.
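
As a sketch of the second option, the read-only access could be set up with ordinary GRANT SELECT statements. The helper below only builds the SQL text; the function name grant_select_sql and the role name r_analyst are hypothetical, and the statements would still have to be run by a database administrator.

```r
# Build GRANT statements giving a role read-only access to chosen tables.
# grant_select_sql() and the role name "r_analyst" are hypothetical.
grant_select_sql <- function(role, tables) {
  # Allow only plain identifiers, since they are pasted into SQL text.
  ok <- grepl("^[A-Za-z_][A-Za-z0-9_]*$", c(role, tables))
  if (!all(ok)) stop("role and table names must be plain identifiers")
  sprintf("GRANT SELECT ON TABLE %s TO %s;", tables, role)
}

grant_select_sql("r_analyst", "_report_malaria_indicators_district")
```

An R user connecting as such a role can read the report tables but cannot modify the transactional data.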

However, your point is well taken. That is why I have been tinkering
around with LucidDB, a column-oriented database that may be more
appropriate for analysis. Which database should be used is probably a
separate discussion, but the point is that a separation between the
"transactional" database that is being used for data entry and the
"analysis" database is a good idea. I would regard this as probably
outside the scope of what DHIS is really intended to do. If people
need to use tools like R, they will likely also be capable of coming
up with their own solution.

However, this does not exclude that certain simple examples could be
built into DHIS. Obviously performance is a concern, but of course, it
depends on what you are trying to do. R is incredibly powerful when it
comes to producing graphics, as I am sure you are aware, and light
years ahead of the other components we are using (jPlot I think).
So, I would think that the typical use case would be to leverage R,
possibly as an extension to DHIS2 for those that need it, for the
generation of analysis tables and graphics that would be beyond the
scope of the "basic" package, which is really limited to aggregation.

Anyway, just a few more thoughts.

Regards,
Jason

···

On Thu, May 27, 2010 at 3:39 PM, Friedman, Roger (CDC/OID/NCHHSTP) (CTR) <rdf4@cdc.gov> wrote:

My concern would be DB performance, there's no telling what kind of locks R or any other product using odbc/jdbc is going to use. I'm already worried about simultaneous transactional and reporting use. Have there been any large-volume performance tests? Has any thought been given to splitting reporting and data entry between different DB servers? I know everyone has been focused on getting the distributed DB aspects right, but assuming universal availability of internet, how would DHIS2 perform on a single (possibly clustered) national DB server?

-----Original Message-----
From: dhis2-users-bounces+rdf4=cdc.gov@lists.launchpad.net [mailto:dhis2-users-bounces+rdf4=cdc.gov@lists.launchpad.net] On Behalf Of Jason Pickering
Sent: Thursday, May 27, 2010 7:03 AM
To: Bob Jolliffe
Cc: dhis2-users@lists.launchpad.net; dhis2-devs
Subject: Re: [Dhis2-users] [Dhis2-devs] DHIS2 with R

Yeah, this I guess comes back time and time again, with my some what
uncomfortable relationship with Hibernate and Java. Clearly, we need
to think about how to make certain procedures crossplatform compatible
(cross platform in the sense of working between Postgres/MySQL and
other DBs) with the need to offer advanced analysis capabilities, with
acceptable performance.

There could be multiple ways of doing it, but in the absense of having
R integrated into DHIS2, I think the most likely shorterm use case
would be just some documentation on how to use the R client with the
DHIS2 database. Perhaps those users that use R over time with DHIS2
could contribute their procedures, which should be able to be
generalized either with PL/R.

Of course the difference with using Postgres, is that R procedures can
be embedded as a new language inside the DB. I am not really sure this
is possible with MySQL. This of course reduces the internal overhead
of getting the data out of Postgres, through Java, and into the R
interpreter, but I am not sure really what the impact of this might be
without testing it.

On Thu, May 27, 2010 at 12:26 PM, Bob Jolliffe <bobjolliffe@gmail.com> wrote:

On 27 May 2010 11:15, Jason Pickering <jason.p.pickering@gmail.com> wrote:

Hi Bob,

Yes, I suspect that most R users would probably want to do things
their own way. It has a rather steep learning curve. :slight_smile:

As for canned R scripts, the best way would probably with with PL/R, a
procedural Postgresql language which utilizes R.

http://www.joeconway.com/plr/doc/index.html

I have done some very basic testing and it seems to work just fine on
the server side.

Swings and roundabouts to a certain extent. The main thing is that
the r scripts are evaluated using the r c library. If they were
invoked from within java/dhis then I guess data access would be slower
than from pl/r (we'd need to have a way to get the data to the r
interpreter), but number crunching would be similar and would also
work with mysql and friends. Not sure which of these are bigger
problems in typical/possible scenarios.

I think they are two separate problems really, but I totally agree, C
is likely going to be faster than Java for big operations. However, I
do think (as all of you know) that the use of stored procedures (with
the wrapper facade type of approach) for certain functions (like
aggregation and heavy cross tab operations) would be much better to be
executed on the database server as a native stored procedure.

Regards,
Jason


_______________________________________________
Mailing list: DHIS 2 developers in Launchpad
Post to : dhis2-devs@lists.launchpad.net
Unsubscribe : DHIS 2 developers in Launchpad
More help : ListHelp - Launchpad Help


_______________________________________________
Mailing list: DHIS 2 Users in Launchpad
Post to : dhis2-users@lists.launchpad.net
Unsubscribe : DHIS 2 Users in Launchpad
More help : ListHelp - Launchpad Help


DHIS already has the concept of datamart and report tables, which do
provide some separation of the transactional from the analytical,
though we also have plans for improving this.

R-node with jQuery/GeoExt looks interesting:

···

On Thu, May 27, 2010 at 5:10 PM, Jason Pickering <jason.p.pickering@gmail.com> wrote:

Hi Roger,

Valid concerns, but I would assume that in the typical use case the R
user would only be 1) looking at a view that has been prepared for
them or 2) given read-only access to selected tables. I would not
expect locks to be a problem in this case.

However, your point is well taken. That is why I have been tinkering
with LucidDB, a column-oriented database that may be more appropriate
for analysis. Which database should be used is probably a separate
discussion, but the point is that a separation between the
"transactional" database that is being used for data entry and the
"analysis" database is a good idea. I would regard this as probably
outside the scope of what DHIS is really intended to do. If people
need to use tools like R, they will likely also be capable of
coming up with their own solution.

However, this does not exclude that certain simple examples could be
built into DHIS. Obviously performance is a concern, but of course, it
depends on what you are trying to do. R is incredibly powerful when it
comes to producing graphics as I am sure that you are aware, and
light-years ahead of the other components we are using (jPlot I think).
So, I would think that the typical use case would be to leverage R,
possibly as an extension to DHIS2 for those that need it, for the
generation of analysis tables and graphics, that would be beyond the
scope of the "basic" package, which is really limited to aggregation.
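As a rough illustration of the graphics use case described above, here is a minimal sketch in base R. It assumes a data frame with the same column names as the report table in the original example; the rows themselves are invented for illustration, and `plot_incidence` is a hypothetical helper name, not part of DHIS2 or any package.

```r
# Mock rows shaped like the report table used in the original example.
# The values below are made up purely for illustration.
report <- data.frame(
  organisationunitname = rep("Example District", 4),
  periodname = c("Jan 2010", "Feb 2010", "Mar 2010", "Apr 2010"),
  malaria_confirm_incidence = c(12.4, 9.8, 15.1, 11.0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: draw one bar per period and write the chart
# directly to a PNG file instead of copying an on-screen device.
plot_incidence <- function(df, path) {
  png(filename = path, width = 800, height = 600)
  on.exit(dev.off())  # close the device even if barplot() fails
  barplot(
    df$malaria_confirm_incidence,
    names.arg = df$periodname,
    main = df$organisationunitname[1],
    las = 2  # rotate axis labels so period names stay readable
  )
  invisible(path)
}

out_file <- file.path(tempdir(), "incidence.png")
plot_incidence(report, out_file)
```

Opening the `png()` device explicitly, rather than using `dev.print()`, means the script does not depend on an interactive display, which may matter if something like this were ever run on a server.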

Anyway, just a few more thoughts.

Regards,
Jason

On Thu, May 27, 2010 at 3:39 PM, Friedman, Roger (CDC/OID/NCHHSTP) (CTR) <rdf4@cdc.gov> wrote:

My concern would be DB performance: there's no telling what kind of locks R or any other product using ODBC/JDBC is going to use. I'm already worried about simultaneous transactional and reporting use. Have there been any large-volume performance tests? Has any thought been given to splitting reporting and data entry between different DB servers? I know everyone has been focused on getting the distributed DB aspects right, but assuming universal availability of internet, how would DHIS2 perform on a single (possibly clustered) national DB server?

-----Original Message-----
From: dhis2-users-bounces+rdf4=cdc.gov@lists.launchpad.net [mailto:dhis2-users-bounces+rdf4=cdc.gov@lists.launchpad.net] On Behalf Of Jason Pickering
Sent: Thursday, May 27, 2010 7:03 AM
To: Bob Jolliffe
Cc: dhis2-users@lists.launchpad.net; dhis2-devs
Subject: Re: [Dhis2-users] [Dhis2-devs] DHIS2 with R

Yeah, this I guess comes back time and time again, with my somewhat
uncomfortable relationship with Hibernate and Java. Clearly, we need
to think about how to balance making certain procedures cross-platform
compatible (cross-platform in the sense of working between
Postgres/MySQL and other DBs) against the need to offer advanced
analysis capabilities with acceptable performance.

There could be multiple ways of doing it, but in the absence of having
R integrated into DHIS2, I think the most likely short-term use case
would be just some documentation on how to use the R client with the
DHIS2 database. Perhaps those users that use R over time with DHIS2
could contribute their procedures, which should be possible to
generalize with PL/R.

Of course the difference with using Postgres is that R procedures can
be embedded as a new language inside the DB. I am not really sure this
is possible with MySQL. This of course reduces the internal overhead
of getting the data out of Postgres, through Java, and into the R
interpreter, but I am not really sure what the impact of this might be
without testing it.
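One small step toward reusable, contributed procedures might be to parameterize the query from the original example instead of hard-coding it. The sketch below uses base R only; `build_report_query` is a hypothetical helper name, and the table and column names are simply those from the original post. The resulting string would then be handed to `dbSendQuery()` on an open connection.

```r
# Hypothetical helper: build the SELECT used to read one organisation
# unit's rows from a report table. The table name is checked against a
# plain identifier pattern and the id is coerced to an integer, so
# neither argument can carry extra SQL into the statement.
build_report_query <- function(table, orgunit_id) {
  stopifnot(length(table) == 1, grepl("^[A-Za-z_][A-Za-z0-9_]*$", table))
  id <- as.integer(orgunit_id)
  stopifnot(length(id) == 1, !is.na(id))
  sprintf("SELECT * FROM %s WHERE organisationunitid = %d", table, id)
}

query <- build_report_query("_report_malaria_indicators_district", 3904)
print(query)
```

Keeping the SQL in one function also gives a single place to change if the same logic were later moved into a PL/R procedure on the server.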


--
Cheers,
Knut Staring

DHIS already has the concept of datamart and report tables, which do
provide some separation of the transactional from the analytical,
though we also have plans for improving this.

R-node with jQuery/GeoExt looks interesting:
R-Node: a web front-end to R with Protovis | R-statistics blog

Slight aside: Protovis looks like a pretty powerful visualization library.


DHIS already has the concept of datamart and report tables, which do
provide some separation of the transactional from the analytical,
though we also have plans for improving this.

R-node with jQuery/GeoExt looks interesting:
R-Node: a web front-end to R with Protovis | R-statistics blog

Slight aside: Protovis looks like a pretty powerful visualization library.

Yes! And I like the very nice overview of BI tools and visualizations
in this presentation:
http://squirelove.net/r-node-extra/r-node-plug-10-04-14.html

(from the R-node homepage: http://www.squirelove.net/r-node/doku.php)

k
