Thread: French PDF manual

French PDF manual

From
Guillaume LELARGE
Date:
Hi,

Michael Glaesemann asked me to share details about the toolchain we use
to build a French PDF manual. Here they are.

We started by changing the SGML files so that they are well-formed and
valid for an XML toolchain:
 - add an <?xml ...?> declaration at the beginning of each file;
 - replace each </> short end tag with the full closing tag;
 - add a trailing / to empty tags (xref, for example);
 - change <!entity to <!ENTITY;
 - *delete* the standalone ignore and include tags, but keep some of
   their text;
 - put all id, linkend and endterm values in lowercase;
 - etc. (I have probably forgotten some).
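
A few of the mechanical steps in the list above can be sketched in Python. This is only an illustration, not the script the French team used: the function names are made up, the regular expressions are an assumption about how the sources look, and the harder steps (expanding </> and handling the standalone sections) are left out.

```python
import re

# Attributes whose values are lowercased (taken from the list above).
ID_ATTRS = ("id", "linkend", "endterm")

# Empty elements that need a trailing "/". xref is named in the message;
# colspec is an assumption added for illustration.
EMPTY_TAGS = ("xref", "colspec")

_ATTR_RE = re.compile(
    r"\b(%s)\s*=\s*([\"'])(.*?)\2" % "|".join(ID_ATTRS), re.IGNORECASE
)
_EMPTY_RE = re.compile(
    r"<(%s)\b([^<>]*?)\s*/?>" % "|".join(EMPTY_TAGS), re.IGNORECASE
)


def lowercase_ids(text):
    """Lowercase the names and values of id, linkend and endterm attributes."""
    def repl(m):
        name, quote, value = m.group(1), m.group(2), m.group(3)
        return "%s=%s%s%s" % (name.lower(), quote, value.lower(), quote)
    return _ATTR_RE.sub(repl, text)


def close_empty_tags(text):
    """Rewrite <xref ...> as <xref .../>; tags that are already closed stay closed."""
    return _EMPTY_RE.sub(
        lambda m: "<%s%s/>" % (m.group(1).lower(), m.group(2)), text
    )


def fix_entity_keyword(text):
    """XML requires the ENTITY keyword in upper case."""
    return re.sub(r"<!entity\b", "<!ENTITY", text)


def to_xml_friendly(text):
    """Apply the three hypothetical fix-ups in sequence."""
    return close_empty_tags(lowercase_ids(fix_entity_keyword(text)))
```

Applying to_xml_friendly() a second time leaves its output unchanged, so a sketch like this could be re-run on partially converted files. Unquoted SGML attribute values are not handled.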

Then we used XSLT stylesheets from the LFS project to build the XHTML and
PDF manuals, with xsltproc and fop 0.20.5. After much tweaking (to get
good computeroutput rendering and to fix the column widths of each
table), we finally got this PDF:
  http://docs.postgresqlfr.org/pgsql-8.1.3-fr/pg813.pdf

Of course, you've seen the one problem for your move to XML: we
deleted the standalone tags. But I think the result deserves some more
work. I would be glad to work on this if you think this could be a
useful addition to the project.

FYI, building the PDF file takes me about 6 minutes on an Athlon 2.2 GHz
with 1 GB of RAM. I think this is good news :) The bad news is that it
takes about half an hour to build the HTML one... pretty ugly... I don't
really know why it takes so much time.

If you're still interested in this work, should I work on the 8.1 branch
or on HEAD?

Hope you find this useful, and forgive my bad English,

Regards.


--
Guillaume.

Re: French PDF manual

From
"Guillaume Lelarge"
Date:
Hi,

2006/5/4, Guillaume LELARGE <guillaume.lelarge@gmail.com>:
> [...]
> Of course, you've seen the only problem for your move to XML : we
> deleted standalone tags. But I think the result deserves some more work. I
> would be glad to work on this if you think this could be a useful
> addition to the project.
>

Well, it was really easy to do. Instead of using standalone tags,
you just add an attribute to each tag whose content depends on the
kind of document you build. For example, I put 'standalone="yes"'
when the tag's content should appear only in standalone mode, and
'standalone="no"' when it should appear only in book mode.

Then I pass these parameters to xsltproc so that it knows which mode I want.
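
The filtering rule can be sketched outside of XSLT. The Python below is only an illustration of the rule described above, assuming the attribute is literally named standalone and takes the values "yes" and "no". The function name profile() is hypothetical, and the real build does this through the stylesheets run by xsltproc, not through a script like this one.

```python
import xml.etree.ElementTree as ET


def _remove_keep_tail(parent, child):
    """Remove child but keep the text that follows it in the parent."""
    index = list(parent).index(child)
    tail = child.tail or ""
    if index == 0:
        parent.text = (parent.text or "") + tail
    else:
        previous = parent[index - 1]
        previous.tail = (previous.tail or "") + tail
    parent.remove(child)


def profile(root, mode):
    """Keep only the content that belongs to the requested mode.

    mode is "yes" for the standalone document and "no" for the book.
    Elements without a standalone attribute are always kept.
    The tree is modified in place and returned.
    """
    for parent in list(root.iter()):
        for child in list(parent):
            value = child.get("standalone")
            if value is None:
                continue
            if value == mode:
                del child.attrib["standalone"]  # not needed in the output
            else:
                _remove_keep_tail(parent, child)
    return root
```

Because profile() changes the tree in place, each output (book or standalone) would start from a freshly parsed copy of the source.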

http://svn.postgresqlfr.org/changeset/242 will give you every change I
made to make it work.

My PDF file is still the same, but I can now build the INSTALL.html document:
  http://docs.postgresqlfr.org/pgsql-8.1.3-fr-ng/standalone-install.html

I'm still willing to do this work, or to help you with it.


--
Guillaume.

Re: French PDF manual

From
Peter Eisentraut
Date:
Guillaume LELARGE wrote:
> We started by changing SGML files in a way that makes them
> syntactically correct and valid with an XML toolchain :

Well, you can go into the documentation source directory and type 'make
postgres.xml' to get an XML formatted DocBook file automatically.

It has occasionally been suggested to convert the source of the
documentation to XML, but while I'm not opposed to that, I don't see
any specific advantages coming from such a move.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: French PDF manual

From
Alvaro Herrera
Date:
Peter Eisentraut wrote:
> Guillaume LELARGE wrote:
> > We started by changing SGML files in a way that makes them
> > syntactically correct and valid with an XML toolchain :
>
> Well, you can go into the documentation source directory and type 'make
> postgres.xml' to get an XML formatted DocBook file automatically.
>
> It has occasionally been suggested to convert the source of the
> documentation to XML, but while I'm not opposed to that, I don't see
> any specific advantages coming from such a move.

It would allow us to use xml2po to maintain translations, for one.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: French PDF manual

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Peter Eisentraut wrote:
>> Well, you can go into the documentation source directory and type 'make
>> postgres.xml' to get an XML formatted DocBook file automatically.
>>
>> It has occasionally been suggested to convert the source of the
>> documentation to XML, but while I'm not opposed to that, I don't see
>> any specific advantages coming from such a move.

> It would allow us to use xml2po to maintain translations, for one.

Sounds like you can use that today, if you want to; it's just one extra
step.  But would it really be helpful on the documentation?  I see where
.po works for a lot of short, independent error messages, but I don't
see it being real useful for big manuscripts.

            regards, tom lane

Re: French PDF manual

From
"Mario Gonzalez"
Date:
On 08/05/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Peter Eisentraut wrote:
> >> Well, you can go into the documentation source directory and type 'make
> >> postgres.xml' to get an XML formatted DocBook file automatically.
> >>
> >> It has occasionally been suggested to convert the source of the
> >> documentation to XML, but while I'm not opposed to that, I don't see
> >> any specific advantages coming from such a move.
>
> > It would allow us to use xml2po to maintain translations, for one.
>
> Sounds like you can use that today, if you want to; it's just one extra
> step.  But would it really be helpful on the documentation?  I see where
> .po works for a lot of short, independent error messages, but I don't
> see it being real useful for big manuscripts.
>

 Hello! My name is Mario and I've been working on that. For learning
purposes I'm writing the FAQ as an XML document. And, like Alvaro said,
«it would allow us to use xml2po»

   So, what can we use for big documents instead?


Re: French PDF manual

From
Guillaume LELARGE
Date:
Peter Eisentraut wrote:
> Guillaume LELARGE wrote:
>> We started by changing SGML files in a way that makes them
>> syntactically correct and valid with an XML toolchain :
>
> Well, you can go into the documentation source directory and type 'make
> postgres.xml' to get an XML formatted DocBook file automatically.
>
> It has occasionally been suggested to convert the source of the
> documentation to XML, but while I'm not opposed to that, I don't see
> any specific advantages coming from such a move.
>

The PDF manual available on the PostgreSQL web site is quite difficult
to use. For example, take a look at table 9.5 (pages 125 and 126): text
runs beyond the cells, which makes it really difficult to read. Sometimes
you can't read the text at all because it runs off the page (see pages
223 and 345).

I think these are really important issues.


--
Guillaume.


Re: French PDF manual

From
Peter Eisentraut
Date:
Guillaume LELARGE wrote:
> The PDF manual available on the postgresql web site is quite
> difficult to use. For example, take a look at table 9.5 (pages 125
> and 126). Text goes beyond the cells. It's really difficult to read
> it. Sometimes you can't read the text because it goes beyond the page
> (see pages 223 and 345).
>
> I think these are real important issues.

Certainly, but what do they have to do with the question whether the
source format of the documentation should be XML or SGML?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: French PDF manual

From
"Guillaume Lelarge"
Date:
2006/5/9, Peter Eisentraut <peter_e@gmx.net>:
> Guillaume LELARGE wrote:
> > The PDF manual available on the postgresql web site is quite
> > difficult to use. For example, take a look at table 9.5 (pages 125
> > and 126). Text goes beyond the cells. It's really difficult to read
> > it. Sometimes you can't read the text because it goes beyond the page
> > (see pages 223 and 345).
> >
> > I think these are real important issues.
>
> Certainly, but what do they have to do with the question whether the
> source format of the documentation should be XML or SGML?
>

OK, I understand what you mean. I don't know if there is a fix for
this issue with SGML and the jade toolkit, but I do know there is one
available with XML and the xsltproc toolkit. And I'm available to do
it.

If someone wants to work on the SGML/jade/DSSSL stylesheets to fix the
issues I mentioned earlier, great, good news. But I haven't seen anyone
talking about this. So perhaps the right way to handle this is to go
the XML way.

Regards.


--
Guillaume.

Re: French PDF manual

From
Peter Eisentraut
Date:
On Tuesday, 9 May 2006 11:19, Guillaume Lelarge wrote:
> If someone wants to work on sgml/jade/dsssl stylesheets to fix the
> issues I mentioned earlier, great, good news. But I haven't seen anyone
> talking about this. So perhaps, the good way to handle this is to go
> the xml way.

I don't think you got my point.  The SGML sources can be converted to XML
automatically.  The fact that the XSLT toolchain might work better is no
reason to convert the sources manually.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: French PDF manual

From
Alvaro Herrera
Date:
Peter Eisentraut wrote:
> On Tuesday, 9 May 2006 11:19, Guillaume Lelarge wrote:
> > If someone wants to work on sgml/jade/dsssl stylesheets to fix the
> > issues I mentioned earlier, great, good news. But I haven't seen anyone
> > talking about this. So perhaps, the good way to handle this is to go
> > the xml way.
>
> I don't think you got my point.  The SGML sources can be converted to XML
> automatically.  The fact that the XSLT toolchain might work better is no
> reason to convert the sources manually.

I have a counter-question.  What value is there in continuing to use
SGML?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: French PDF manual

From
"Guillaume Lelarge"
Date:
2006/5/9, Peter Eisentraut <peter_e@gmx.net>:
> On Tuesday, 9 May 2006 11:19, Guillaume Lelarge wrote:
> > If someone wants to work on sgml/jade/dsssl stylesheets to fix the
> > issues I mentioned earlier, great, good news. But I haven't seen anyone
> > talking about this. So perhaps, the good way to handle this is to go
> > the xml way.
>
> I don't think you got my point.  The SGML sources can be converted to XML
> automatically.  The fact that the XSLT toolchain might work better is no
> reason to convert the sources manually.
>

How do you set the column widths in a table? I use this:
  <colspec colnum="1" colwidth="1.5*"/>
But I don't know whether the same thing exists in SGML.

This is something you need to fix to get proper content in a table's cells.
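
For readers unfamiliar with the "1.5*" notation: a colwidth ending in * is a proportional width, so the columns share the available space in the ratio of their factors. A small Python sketch of that arithmetic follows. The function name and the idea of passing a total table width are made up for the example, and fixed widths such as "2cm" are deliberately not handled.

```python
def resolve_colwidths(colwidths, table_width):
    """Split table_width among columns given proportional specs like '1.5*'.

    A bare '*' is treated as '1*'. Only proportional widths are accepted
    in this sketch; anything else raises ValueError.
    """
    factors = []
    for spec in colwidths:
        spec = spec.strip()
        if not spec.endswith("*"):
            raise ValueError("only proportional widths are handled: %r" % spec)
        number = spec[:-1]
        factors.append(float(number) if number else 1.0)
    total = sum(factors)
    # Each column gets its share of the width in proportion to its factor.
    return [table_width * factor / total for factor in factors]
```

With colwidths of 1.5*, 1* and 1* and 350 points to share, the first column gets 150 points and the other two get 100 points each.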


--
Guillaume.

Re: French PDF manual

From
Peter Eisentraut
Date:
On Tuesday, 9 May 2006 14:00, Guillaume Lelarge wrote:
> How do you take care of the columns' size in a table ? I use this :
>   <colspec colnum="1" colwidth="1.5*"/>
> But I don't know if there is the same thing in SGML.

DocBook is DocBook.  It doesn't matter if it's written in SGML or XML.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: French PDF manual

From
Peter Eisentraut
Date:
On Tuesday, 9 May 2006 13:40, Alvaro Herrera wrote:
> I have a counter-question.  What value is there in continuing to use
> SGML?

Tag reduction makes editing easier (for some).

There's the question whether all the files should be renamed to .xml.

Marked sections would need to be replaced with a profiling mechanism.  The
mechanisms for this exist but they are not all that elegant.

I'm all in favor of considering a move to XML, but we need to work with the
facts.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: French PDF manual

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I have a counter-question.  What value is there in continuing to use
> SGML?

We're already used to it, and it's not clear what we'd buy from the
effort of converting all our documentation.

            regards, tom lane

Re: French PDF manual

From
Peter Eisentraut
Date:
Bruce Momjian wrote:
> Should we modify the doc build process to generate XML and convert
> that to PDF, or allow it as an option, because PDF generation has
> always been a problem from SGML (very slow)?

If the FOP-based build works better than Jade, by all means let's use
that.  Last time I looked at FOP (years ago, admittedly), it broke at
about page 10.  Obviously, Guillaume and company are getting better
results now.

What I would like to know, however, is whether FOP runs on a Free(tm)
Java implementation.  It seems that you need Sun's JDK to run it and
that will be a problem if it's to become our primary build method.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: French PDF manual

From
"Guillaume Lelarge"
Date:
2006/5/9, Peter Eisentraut <peter_e@gmx.net>:
> Bruce Momjian wrote:
> > Should we modify the doc build process to generate XML and convert
> > that to PDF, or allow it as an option, because PDF generation has
> > always been a problem from SGML (very slow)?
>
> If the FOP-based build works better than Jade, by all means let's use
> that.  Last time I looked at FOP (years ago, admittedly), it broke at
> about page 10.  Obviously, Guillaume and company are getting better
> results now.
>

We are getting better results in quality. Speed is a different matter:
really quick to build a PDF, really slow to build the HTML files.

> What I would like to know, however, is whether FOP runs on a Free(tm)
> Java implementation.  It seems that you need Sun's JDK to run it and
> that will be a problem if it's to become our primary build method.
>

I use Sun's JDK. I haven't tried anything else... I'm going to try.


--
Guillaume.

Re: French PDF manual

From
Guillaume LELARGE
Date:
Peter Eisentraut wrote:
> Bruce Momjian wrote:
>> Should we modify the doc build process to generate XML and convert
>> that to PDF, or allow it as an option, because PDF generation has
>> always been a problem from SGML (very slow)?
>
> If the FOP-based build works better than Jade, by all means let's use
> that.  Last time I looked at FOP (years ago, admittedly), it broke at
> about page 10.  Obviously, Guillaume and company are getting better
> results now.
>
> What I would like to know, however, is whether FOP runs on a Free(tm)
> Java implementation.  It seems that you need Sun's JDK to run it and
> that will be a problem if it's to become our primary build method.
>

I finally had time to work on this. It works just great with
libgcj. I just had to export JAVA_HOME with a new location and
install Jimi (for image rendering). There's only one issue: I found it
quite slow.


Re: French PDF manual

From
Peter Eisentraut
Date:
Guillaume LELARGE wrote:
> I finally had time to work on this matter. It works just great with
> libgcj. I just had to export JAVA_HOME with a new location and to
> install Jimi (for image rendering). There's only one issue : I found
> it quite slow.

As long as it works, it's OK.

Would you submit the changes that you had to make to the source code to
get useful results?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: French PDF manual

From
Guillaume LELARGE
Date:
Peter Eisentraut wrote:
> Guillaume LELARGE wrote:
>> I finally had time to work on this matter. It works just great with
>> libgcj. I just had to export JAVA_HOME with a new location and to
>> install Jimi (for image rendering). There's only one issue : I found
>> it quite slow.
>
> As long as it works, it's OK.
>
> Would you submit the changes that you had to make to the source code to
> get useful results?
>

Sorry Peter, I don't think I really understood what you meant. Which
changes are you talking about? Which source code?


--
Guillaume.
<!-- http://abs.traduc.org/
     http://lfs.traduc.org/
     http://traduc.postgresqlfr.org/ -->

Re: French PDF manual

From
Peter Eisentraut
Date:
On Thursday, 25 May 2006 01:31, Guillaume LELARGE wrote:
> Sorry Peter, I don't think I really understood what you meant. Which
> changes are you talking about ? which source code ?

I thought that you had to make some changes to the documentation source code
to get it to work with fop.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: French PDF manual

From
"Guillaume Lelarge"
Date:
2006/5/26, Peter Eisentraut <peter_e@gmx.net>:
> On Thursday, 25 May 2006 01:31, Guillaume LELARGE wrote:
> > Sorry Peter, I don't think I really understood what you meant. Which
> > changes are you talking about ? which source code ?
>
> I thought that you had to make some changes to the documentation source code
> to get it to work with fop.
>

Yes, you're right. There are some changes to make to the documentation
source files. I can provide a diff which will transform the SGML files
into valid XML files. Do you want me to work on the REL8_1_STABLE or
HEAD branch?

I can work on this next Monday and Tuesday.

Regards.


--
Guillaume.

Re: French PDF manual

From
Peter Eisentraut
Date:
On Friday, 26 May 2006 14:19, Guillaume Lelarge wrote:
> Yes, you're right. There are some changes to make to the documentation
> source files. I can provide a diff which will transform SGML files
> into valid XML files.

Is that the only change?  Then we don't need it.

> Do you want me to work on REL8_1_STABLE or HEAD
> branch ?

HEAD branch always.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: French PDF manual

From
"Guillaume Lelarge"
Date:
2006/5/26, Peter Eisentraut <peter_e@gmx.net>:
> On Friday, 26 May 2006 14:19, Guillaume Lelarge wrote:
> > Yes, you're right. There are some changes to make to the documentation
> > source files. I can provide a diff which will transform SGML files
> > into valid XML files.
>
> Is that the only change?  Then we don't need it.
>

I added some tags to specify columns' size
(http://www.oasis-open.org/docbook/documentation/reference/html/colspec.html
and http://www.sagehill.net/docbookxsl/ColumnWidths.html) and XSLT
stylesheets to build HTML, PDF, htmlhelp and manpages.

I also added some attributes to handle conditional texts
(http://www.sagehill.net/docbookxsl/Profiling.html#MarkProfileText),
needed to build the standalone install file.

> > Do you want me to work on REL8_1_STABLE or HEAD
> > branch ?
>
> HEAD branch always.
>

OK.


--
Guillaume.

Re: French PDF manual

From
Peter Eisentraut
Date:
On Friday, 26 May 2006 14:46, Guillaume Lelarge wrote:
> I added some tags to specify columns' size
> (http://www.oasis-open.org/docbook/documentation/reference/html/colspec.html
> and http://www.sagehill.net/docbookxsl/ColumnWidths.html)

That's what we need, I guess.

> and XSLT
> stylesheets to build HTML, PDF, htmlhelp and manpages.

We already have that.

> I also added some attributes to handle conditional texts
> (http://www.sagehill.net/docbookxsl/Profiling.html#MarkProfileText),
> needed to build the standalone install file.

That will become necessary once we switch the source to XML.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: French PDF manual

From
"Guillaume Lelarge"
Date:
2006/5/26, Peter Eisentraut <peter_e@gmx.net>:
> On Friday, 26 May 2006 14:46, Guillaume Lelarge wrote:
> > I added some tags to specify columns' size
> > (http://www.oasis-open.org/docbook/documentation/reference/html/colspec.html
> > and http://www.sagehill.net/docbookxsl/ColumnWidths.html)
>
> That's what we need, I guess.
>

Yes, it's required if we don't want the text of one cell to overflow
into the following cell.

> > and XSLT
> > stylesheets to build HTML, PDF, htmlhelp and manpages.
>
> We already have that.
>

OK.

> > I also added some attributes to handle conditional texts
> > (http://www.sagehill.net/docbookxsl/Profiling.html#MarkProfileText),
> > needed to build the standalone install file.
>
> That will become necessary once we switch the source to XML.
>

Yes.

So, you just need the colspec tags? It won't be easy, because
there are a lot of checks to do after the changes. But I can try.


--
Guillaume.

Re: French PDF manual

From
Mario
Date:
On 26/05/06, Guillaume Lelarge <guillaume.lelarge@gmail.com> wrote:
> 2006/5/26, Peter Eisentraut <peter_e@gmx.net>:
> > On Friday, 26 May 2006 14:19, Guillaume Lelarge wrote:
> > > Yes, you're right. There are some changes to make to the documentation
> > > source files. I can provide a diff which will transform SGML files
> > > into valid XML files.
> >

   I've got the 8.1 postgres manuals in DocBook XML 4.2 format and I'm
using POT files for translations.  As soon as possible I'll make the
HTML, XML and POT files and a small status page available.

  However, the XML files are not 100% valid, but it works for me.

> > Is that the only change?  Then we don't need it.
> >
>
> I added some tags to specify columns' size
> (http://www.oasis-open.org/docbook/documentation/reference/html/colspec.html
> and http://www.sagehill.net/docbookxsl/ColumnWidths.html) and XSLT
> stylesheets to build HTML, PDF, htmlhelp and manpages.
>
> I also added some attributes to handle conditional texts
> (http://www.sagehill.net/docbookxsl/Profiling.html#MarkProfileText),
> needed to build the standalone install file.
>
> > > Do you want me to work on REL8_1_STABLE or HEAD
> > > branch ?
> >
> > HEAD branch always.
> >
>
> OK.
>