GSOC_2014 Scriptable SMS parsing

Hi All

I’m an MPhil student at the University of Colombo School of Computing. I was also a successful GSoC participant for the DBpedia project last year. My current research includes information retrieval, NLP, the Semantic Web, ontologies, etc. I have also done some research in the area of disaster management and ICT4D.

I’m interested in the above idea. I went through some of the discussion threads about it on the mailing list and read Saptarshi’s and Peder’s replies.


Here is my draft proposal for the above idea.

In a mobile environment, the target texts contain many morphological variations (e.g. omitted blanks, typos, abbreviated words). Parsing is not feasible for messages in which both the syntax and the spelling are unreliable, so I propose an alternative approach that uses pattern matching rather than parsing. The idea is to achieve the equivalent of a parse by matching each SMS against a number of patterns appropriate to the DHIS data elements. The proposed approach would initially match sentences against manually generated templates (later to be extended to a semi-automatic approach) to identify DHIS data elements and the data values associated with them.

e.g. template

“The of <<SomeValue/Element>> is <>”
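As a rough illustration of the matching step, here is a minimal sketch in Python. It is not DHIS2 code: the data element names, the `<<element>>` and `<<value>>` slot names, the two templates, and the helper functions are all hypothetical, and the typo tolerance via `difflib` is only one possible choice. Each template is compiled into a regular expression that tolerates missing blanks, and the captured element text is then matched approximately against the known element names.

```python
import re
from difflib import get_close_matches

# Hypothetical data element names; a real system would load these from DHIS2.
DATA_ELEMENTS = ["malaria cases", "measles cases", "births"]

# Hypothetical hand-written templates. <<element>> and <<value>> are slots.
TEMPLATES = [
    "the number of <<element>> is <<value>>",
    "<<element>> <<value>>",
]

# Regex fragment used for each slot (assumption: values are whole numbers).
SLOT_PATTERNS = {
    "element": r"(?P<element>[a-z ]+?)",
    "value": r"(?P<value>\d+)",
}


def compile_template(template):
    """Turn a template into a regex; blanks between words become optional."""
    parts = re.split(r"<<(\w+)>>", template)
    pieces = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            # Odd positions hold the slot names captured by re.split.
            pieces.append(SLOT_PATTERNS[part])
        else:
            # Literal text: escape each word, allow zero or more blanks around it.
            words = [re.escape(word) for word in part.split()]
            if words:
                pieces.append(r"\s*" + r"\s*".join(words) + r"\s*")
            else:
                pieces.append(r"\s*")
    return re.compile("^" + "".join(pieces) + "$")


COMPILED = [compile_template(t) for t in TEMPLATES]


def parse_sms(message):
    """Return (data element, value) for the first matching template, else None."""
    text = message.lower().strip()
    for pattern in COMPILED:
        match = pattern.match(text)
        if not match:
            continue
        # Tolerate typos by picking the closest known data element name.
        candidates = get_close_matches(
            match.group("element").strip(), DATA_ELEMENTS, n=1, cutoff=0.8
        )
        if candidates:
            return candidates[0], int(match.group("value"))
    return None
```

With these assumptions, `parse_sms("thenumber of malria cases is12")` resolves to the element `malaria cases` with the value 12 despite the missing blanks and the typo, while a message that fits no template returns `None`.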

Do you think this is a good way of achieving the above task?


Regards

Kasun Perera

Hi Kasun,

Thanks for your interest in DHIS2 and the SMS parsing project.

The proposal is quite short, and it's not clear to me what steps you plan to take to complete the project.

Detailing what pattern matching means, how templates are generated, and what you mean by semi-automatic will improve your proposal.

Good luck with the proposal.


Regards,
Saptarshi PURKAYASTHA

On 14 March 2014 09:14, kasun perera kkasunperera@gmail.com wrote:


Mailing list: https://launchpad.net/~dhis2-devs

Post to : dhis2-devs@lists.launchpad.net

Unsubscribe : https://launchpad.net/~dhis2-devs

More help : https://help.launchpad.net/ListHelp

Hi Saptarshi

I have submitted my detailed proposal, addressing the above points, to the Melange system. I would be glad if you could give me some feedback on it.

Thanks


On Mon, Mar 17, 2014 at 4:27 AM, Saptarshi Purkayastha sunbiz@gmail.com wrote:



Regards

Kasun Perera

