Documentation source moving

Hello,

Because of some huge commits of images from me, the documentation
branch in Bazaar has become almost unusable for people on slow lines,
as it is now 168 MB to check out.

We are therefore planning to create new branches, either in
Launchpad/Bazaar or in Google Code/Subversion.

Knut

Hi Knut,
Thanks for your efforts on this. I am reluctant to move the source
just yet. I think we need to think through the translation workflow a
bit better, and not introduce too many changes at once. We have seen a
few more commits from others, which is a good thing, but of course,
getting involved in the documentation effort is still pretty
complicated with bzr, DocBook, etc. Introducing more tools like PO and
moving the source and images out of Launchpad may just complicate
things. Let's try to keep it as simple as possible.

Firstly, I have removed all the revisions up until 214, which I think
is where all the commits related to multilingual documents started.

Going forward I would suggest that we fork each language into a
separate tree, in order to cut down on the size of the individual
branches. Each language would be placed in a separate tree that is
branched from the main English documentation. This seems to make sense
to me, as most people will not need every language, but would be more
interested in working on a particular language. So, for instance, to
fork the current branch you would do this:

1) Fork the branch:
   bzr branch lp:~dhis2-documenters/dhis2/dhis2-docbook-docs dhis2-docbooks-docs-fr
2) Make all the necessary changes. Transform the XML to PO files,
   machine translate them, revise the PO files, transform back to XML,
   and modify the pom.xml file to suit your needs.
3) Push the changes back to Launchpad in a new branch:
   bzr push lp:~dhis2-documenters/dhis2/dhis2-docbook-docs-fr/
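
As a rough illustration only, steps 1 and 3 could be wrapped in a small Python helper that builds the two bzr command lines for any language code. The helper name `fork_commands` and the local directory name are hypothetical; the branch locations are the ones quoted in the steps. It prints the commands rather than running them, and it assumes the changes from step 2 have been committed before the push.

```python
import shlex

# Branch location quoted in the steps above.
ENGLISH_BRANCH = "lp:~dhis2-documenters/dhis2/dhis2-docbook-docs"


def fork_commands(lang):
    """Return (working_directory, argv) pairs for steps 1 and 3."""
    local_dir = f"dhis2-docbook-docs-{lang}"      # hypothetical local directory name
    language_branch = f"{ENGLISH_BRANCH}-{lang}"  # naming pattern taken from step 3
    return [
        # Step 1: fork the English branch into a local working tree.
        (".", ["bzr", "branch", ENGLISH_BRANCH, local_dir]),
        # Step 2 (translation, pom.xml changes, commit) is done by hand in local_dir.
        # Step 3: publish the result as a new language branch on Launchpad.
        (local_dir, ["bzr", "push", language_branch]),
    ]


if __name__ == "__main__":
    # Dry run: show the commands instead of executing them.
    for cwd, argv in fork_commands("fr"):
        print(f"(in {cwd}) {shlex.join(argv)}")
```

Printing first makes it easy to check the branch names before anything is pushed to Launchpad.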

I have followed this workflow, and created this new branch.

I think this should be a more sustainable workflow and, with more
careful management of the images, should keep the branches down to a
reasonable size. I think it will also help to encourage
ownership of each language. People would be free to translate back and
forth from each branch, but I assume for the time being, most stuff
would be translated from English into other languages.

Also, using the command bzr checkout --lightweight
lp:~dhis2-documenters/dhis2/dhis2-docbook-docs will only pull the
latest revision, but not all the history, which would not be entirely
necessary when creating a fork for a given language from the latest
English version. This should cut down a lot on the amount of data
that needs to be transferred.

If this sounds acceptable to everyone, I can write this up in the
Documentation guide.

Best regards,
Jason

···

On Mon, Sep 6, 2010 at 6:41 PM, Knut Staring <knutst@gmail.com> wrote:


_______________________________________________
Mailing list: DHIS 2 developers in Launchpad
Post to : dhis2-devs@lists.launchpad.net
Unsubscribe : DHIS 2 developers in Launchpad
More help : ListHelp - Launchpad Help

--
Jason P. Pickering
email: jason.p.pickering@gmail.com
tel:+17069260025
sip:jason.p.pickering@ekiga.net

Thanks, Jason, many good ideas. However, I don’t think

,sent from my mobile

···

On Sep 7, 2010 6:32 AM, “Jason Pickering” jason.p.pickering@gmail.com wrote:


Thanks a lot Jason. Comments below.

> Hi Knut,
> Thanks for your efforts on this. I am reluctant to move the source
> just yet. I think we need to think through the translation workflow a
> bit better, and not introduce too many changes at once. We have seen a
> few more commits from others, which is a good thing, but of course,
> getting involved in the documentation effort is still pretty
> complicated with bzr, DocBook, etc. Introducing more tools like PO and
> moving the source and images out of Launchpad may just complicate
> things. Let's try to keep it as simple as possible.

I agree very much with keeping things simple. To me, introducing PO is
not an additional step to an already complicated process, but a way to
allow translators to bypass the whole process: They don't have to get
involved with Maven, Bazaar, or Serna or XML, just POedit.
This means that someone else will have to handle the structure of the
document and checking in translations, but I think that is a lot
easier than providing remote support to people who are not using the
above tools on a regular basis. So basically I suggest email and
POedit as the only tools needed, which I think will vastly enlarge the
pool of possible collaborators.

> Firstly, I have removed all the revisions up until 214, which I think
> is where all the commits related to multilingual documents started.

Thanks, that was needed.

> Going forward I would suggest that we fork each language into a
> separate tree, in order to cut down on the size of the individual
> branches. Each language would be placed in a separate tree that is
> branched from the main English documentation. This seems to make sense
> to me, as most people will not need every language, but would be more
> interested in working on a particular language.

I agree that most people only need one language and perhaps English
and that the size of the screenshots means some kind of splitting up,
so people don't have to download hundreds of MB. There is also merit
to allowing each language community to manage their own documentation
needs, as the VN team has already done by introducing their own
filenames and structure.

However, in my opinion, we need to think along the lines of the HISP
concept of minimal common dataset, and concentrate on a common core of
documentation with a unified structure. Any changes for a particular
language should be additions to that core. There are two main reasons
for this: inline help in the DHIS2 application, and the need to manage
the overall translation process.

We now need a system for multiple languages in
dhis-2\dhis-services\dhis-service-options\src\main\resources, e.g.
help_content.en.xml, help_content.fr.xml etc. And then of course the
right one must be chosen according to the user GUI setting.
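
To make the file selection concrete, here is a minimal Python sketch (DHIS2 itself is Java, so this only illustrates the logic). The function name `help_file_for` is hypothetical, and falling back to the English file when no translation exists is an assumption that the message does not state.

```python
from pathlib import Path


def help_file_for(gui_locale, resources_dir, master_lang="en"):
    """Pick help_content.<lang>.xml for a GUI locale, else the English master."""
    # Reduce "fr_FR" or "fr-FR" to the bare language code "fr".
    lang = gui_locale.replace("-", "_").split("_")[0].lower()
    candidate = Path(resources_dir) / f"help_content.{lang}.xml"
    if candidate.is_file():
        return candidate
    # Assumed fallback: languages without a translation get the English file.
    return Path(resources_dir) / f"help_content.{master_lang}.xml"
```

With only help_content.en.xml and help_content.fr.xml present, a user with a French GUI setting would get the French file and everyone else the English one.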

For this to work well, we need to maintain exactly the same structure
for all languages. This means that we only maintain one set of Docbook
files, namely for English. Whenever structural changes need to be
made, they should be made in these master XML files. For the other
languages, we only need PO files with translations. A first set of PO
files can be generated using Google Translate, but then must be turned
over to translators (with POedit as the only tool).
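
The idea of one English master plus per-language string catalogs can be sketched as follows. This is a simplified illustration, not the real PO toolchain: the dictionary `FR_CATALOG` stands in for a fr.po file, the sample chapter is invented, and mixed content (text after child elements) is ignored.

```python
import xml.etree.ElementTree as ET

# Invented English master fragment; the real files are the DocBook sources.
MASTER = """<chapter id="data_entry">
  <title>Data entry</title>
  <para>Select an organisation unit.</para>
</chapter>"""

# Stand-in for a fr.po file: English source string -> translation.
FR_CATALOG = {"Data entry": "Saisie des données"}


def translate(master_xml, catalog):
    """Swap in translated strings while leaving the XML structure untouched."""
    root = ET.fromstring(master_xml)
    for element in root.iter():
        text = (element.text or "").strip()
        if text:
            # Strings missing from the catalog stay in English.
            element.text = catalog.get(text, text)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(translate(MASTER, FR_CATALOG))
```

Because only text nodes change, element names and ids are identical in every language, which is what the inline help needs.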

So English remains the master language, which I think is only
realistic, looking at the project. And it will be easy to point people
to the right documentation without even knowing a language. Thus we
should not fork the text at all (it doesn't get very big anyway), but
keep it all in the current structure. Instead, we should remove the
/images folder and check it into a separate branch. If people want to
create separate screenshots for other languages such as French, these
should be in yet another branch. Alternatively, we could place all
screenshots on another server for download, outside of version
control.

This means a minimal adjustment to what we currently have (just moving
the images), but of course the instructions for building must explain
exactly what to do with the images. It would of course be possible to
script this as well, though I am not sure it is worth the effort.

> Also, using the command bzr checkout --lightweight
> lp:~dhis2-documenters/dhis2/dhis2-docbook-docs will only pull the
> latest revision

This is very useful to know, and should be highlighted on dhis2.org
and in the documentation. It would be good to have a few more Bazaar
commands in the docs.

Knut

--
Cheers,
Knut Staring

Hi Knut,
I think we are perhaps discussing two separate parts of the process.

Here is the scenario I am thinking about. A translation manager
(someone who is familiar with bzr, DocBook, Serna, Maven, and Launchpad)
would initiate a translation process. They would fork the English
language branch and produce PO files for a given language. These would
be machine translated in the first instance, and passed onto
translators. The translators would polish the PO files, and deliver
these back (via email, USB, etc) to the translation manager. The
translation manager would reproduce the DocBook XML from the
translated PO files, and likely coordinate any replacement of screen
shots in the native language. Changes to the pom.xml would need to be
made as well. Once all of this is done, they would push the
revised changes back to a language specific documentation branch,
which would essentially be a standalone branch capable of producing
HTML, PDF and so forth by itself.

So I think we actually agree with each other for the most part. This
workflow would "insulate" the translators from unnecessary
complications, but not necessarily exclude those translators who would
be comfortable working with these tools. We can always merge the
branches back together if needed, but keeping them forked would seem
to make them more manageable, as well as creating a sense of
ownership.

As for the inline help, I agree with you that the document structure
would need to be exactly the same. However, it is not yet clear
exactly how this is going to happen. The XML in the inline help has
been extracted, I think, manually. Theoretically, we could produce a
document that uses the lang=fr attribute to indicate that, for a given
<id> in the document, two languages are available. We would need to
implement this in the existing document somehow, possibly along with
processing instructions or an XSL stylesheet to extract a secondary
XML file for the inline help.
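
A rough sketch of that extraction step, written in Python rather than XSL purely for illustration. The document layout (one section per id, one para per language), the output element names `help` and `entry`, and the fallback to English are all assumptions, since the actual structure has not been decided.

```python
import xml.etree.ElementTree as ET

# Hypothetical multilingual source: one section per id, one para per language.
SAMPLE = """<book>
  <section id="data_entry">
    <para lang="en">Enter data values in the form.</para>
    <para lang="fr">Saisissez les valeurs dans le formulaire.</para>
  </section>
  <section id="reports">
    <para lang="en">Reports summarise the data.</para>
  </section>
</book>"""


def extract_help(source_xml, lang, master_lang="en"):
    """Build a secondary help XML holding one entry per section id."""
    root = ET.fromstring(source_xml)
    help_root = ET.Element("help", {"lang": lang})
    for section in root.iter("section"):
        by_lang = {para.get("lang"): para for para in section.findall("para")}
        para = by_lang.get(lang)
        if para is None:
            # Assumed fallback: use the English text until a translation exists.
            para = by_lang.get(master_lang)
        if para is None:
            continue
        entry = ET.SubElement(help_root, "entry", {"id": section.get("id")})
        entry.text = para.text
    return ET.tostring(help_root, encoding="unicode")


if __name__ == "__main__":
    print(extract_help(SAMPLE, "fr"))
```

Keying the output on the section id is what keeps the inline help aligned across languages.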

Regards,
Jason

···

On Tue, Sep 7, 2010 at 8:10 AM, Knut Staring <knutst@gmail.com> wrote:
