Thread: Moving documentation to XML

Moving documentation to XML

From
Luzanov Pavel
Date:

Peter,


I found this message in the archives:

http://www.postgresql.org/message-id/flat/519C3D99.9000304@gmx.net#519C3D99.9000304@gmx.net


and, as you recommended, tested the speed of building the docs on a fresh Ubuntu installation.

50sec for make html
and 14min 50sec for make xslthtml
17 times slower!
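
A quick check of the arithmetic behind that figure, as a minimal Python sketch (the helper name to_seconds is made up for illustration). 14 min 50 s is 890 s, so the ratio is about 17.8, which truncates to the 17 quoted above.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair to plain seconds."""
    return minutes * 60 + seconds

html_time = to_seconds(0, 50)        # "make html"
xslt_time = to_seconds(14, 50)       # "make xslthtml"

ratio = xslt_time / html_time        # exact ratio
whole_ratio = xslt_time // html_time # integer part, as quoted in the message

print(f"make xslthtml is about {ratio:.1f} times slower ({whole_ratio} when truncated)")
```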


Is this still the main blocker for moving to XML?


-----

Pavel Luzanov
pluzanov@postgresql.ru

Re: Moving documentation to XML

From
Peter Eisentraut
Date:
On 4/2/15 5:22 PM, Luzanov Pavel wrote:
> Peter,
>
>
> I found this message in archives:
>
> http://www.postgresql.org/message-id/flat/519C3D99.9000304@gmx.net#519C3D99.9000304@gmx.net
>
>
> and, as you recommend, tested a speed of building docs on a fresh Ubuntu
> installation.
>
> 50sec for make html
> and 14min 50sec for make xslthtml
> 17 times slower!
>
>
> Is it still a main stopper for moving to XML?

Yes :)



Re: Moving documentation to XML

From
Alexander Lakhin
Date:
Hello, Peter.

I've managed to speed up HTML generation from XML (make xslthtml) from
32 min. (in my environment) to 4 min. by modifying the slowest XSL templates.
All my modifications are incorporated in a single file,
stylesheet-xhtml-speedup.xsl, which is included in stylesheet.xsl.
I performed the optimization by analyzing the output of:
xsltproc --profile --stringparam pg.version '9.6devel' stylesheet.xsl postgres.xml
Initial statistics:
number  match      name                         mode           Calls  Tot 100us    Avg

     0  appendix                                label.markup   23090   90677526   3927
     1  chapter                                 label.markup   28870   39740757   1376
     2             chunk-all-sections                           1289   23845066  18498
     3             make.legalnotice.head.links                  2578    9630258   3735
     4  indexterm                               reference       2579    4126513   1600
     5             html.head                                    1289    3112534   2414
...
index % time    self  children    called     name
                 0.479 1326.034     22/23090     toc.line [61]
                 5.128 1308.245  21944/23090 sect1[label.markup] [13]
                 3.772 1318.264    850/23090 substitute-markup [15]
                 1.355 1304.631    274/23090 figure|table|example[label.markup] [32]
[0]    47.95  906.775    1.613  23090 appendix[label.markup] [0]
                 1.613    0.000  23090/23090 autolabel.format [29]
-----------------------------------------------
                 5.128 1308.245  24708/28870 sect1[label.markup] [13]
                 0.479 1326.034    130/28870     toc.line [61]
                 3.772 1318.264   2112/28870 substitute-markup [15]
                 1.355 1304.631   1920/28870 figure|table|example[label.markup] [32]
[1]    21.01  397.408    1.613  28870 chapter[label.markup] [1]
                 1.613    0.000  28870/28870 autolabel.format [29]
-----------------------------------------------
                 0.164  238.606   1289/1289 process-chunk-element [98]
[2]    12.61  238.451    0.225   1289 chunk-all-sections [2]
                 0.225    7.117   1289/1289 process-chunk [86]
-----------------------------------------------
                31.125  112.261   1289/2578      html.head [5]
                96.303   96.726   1289/2578 make.legalnotice.head.links [3]
[3]     5.09   96.303   96.726   2578 make.legalnotice.head.links [3]
                96.303   96.726   1289/3867 make.legalnotice.head.links [3]
                 0.339    0.494   1289/3867 *[object.title.markup.textonly] [69]
                 0.085    0.781   1289/3867 ln.or.rh.filename [116]

-----------------------------------------------


Current statistics:
number  match      name                         mode           Calls  Tot 100us    Avg

     0             chunk-all-sections                           1289    5405958   4193
     1             make.legalnotice.head.links                  1289    3159538   2451
     2             html.head                                    1289    3068417   2380
     3             gentext.template                           689835    2327761      3
     4             l10n.language                              564453    1455253      2
     5             href.target                                 29881    1344063     44
---
index % time    self  children    called     name
                 0.136   54.207   1289/1289 process-chunk-element [95]
[0]    20.40   54.060    0.312   1289 chunk-all-sections [0]
                 0.312    6.468   1289/1289 process-chunk [67]
-----------------------------------------------
                30.684   45.458   1289/1289      html.head [2]
[1]    11.92   31.595    0.448   1289 make.legalnotice.head.links [1]
                 0.290    0.403   1289/2578 *[object.title.markup.textonly] [71]
                 0.159    0.828   1289/2578 ln.or.rh.filename [91]
-----------------------------------------------
                 0.330   31.617   1289/1289 chunk-element-content [65]
[2]    11.58   30.684   45.458   1289     html.head [2]
                31.595    0.448   1289/15462 make.legalnotice.head.links [1]
                13.441    4.726   5153/15462 href.target [5]
                 0.290    0.403   5153/15462 *[object.title.markup.textonly] [71]
                 0.115    1.576   1289/15462 head.content [99]
                 0.012    0.000   1289/15462 system.head.content [186]
                 0.006    0.000   1289/15462 user.head.content [228]
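
As a rough cross-check of the two profiles, here is a small Python sketch. It assumes that the "self" column of the call graph is in seconds and that "% time" is that entry's share of the whole run; both are an interpretation of the listing above, not something stated in it. Under those assumptions the totals implied by the top entries come out at roughly 31.5 minutes before and 4.4 minutes after, in line with the 32 min. and 4 min. figures quoted at the top of this message.

```python
def implied_total_seconds(self_seconds, percent_of_total):
    """Total run time implied by one call-graph entry.

    Assumes 'self' is in seconds and '% time' is the entry's share
    of the whole run (an interpretation, not documented behaviour).
    """
    return self_seconds / (percent_of_total / 100.0)

# Top entry of each call graph, copied from the listings above.
before = implied_total_seconds(906.775, 47.95)  # appendix[label.markup]
after = implied_total_seconds(54.060, 20.40)    # chunk-all-sections

print(f"before: {before / 60:.1f} min")
print(f"after:  {after / 60:.1f} min")
print(f"overall speedup: {before / after:.1f}x")
```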

To make sure that the result of the transformation is the same, I've
compared the original .html files with those generated by the modified templates.
Unfortunately, xslt generates random ids, so they have to be excluded
before comparing. I do that with:
for f in */*.html; do
  sed -i -e 's/id="\(ftn\.\)\?id[a-z][0-9]\+"/id="id"/g' \
         -e 's/href="[^#]*#\(ftn\.\)\?id[a-z][0-9]\+"/href="#"/g' "$f"
done
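
The same normalization can be sketched in Python, which is convenient for trying the patterns on a small sample before running sed over the whole tree. The function name normalize_ids and the sample markup are illustrative only, and the href pattern here also excludes quotes from the path part, a small tightening that is not in the sed command above.

```python
import re

# Generated ids look like id="idp12345" or id="ftn.idm678" (assumed shape,
# matching the sed patterns above).
ID_ATTR = re.compile(r'id="(ftn\.)?id[a-z][0-9]+"')
HREF_ATTR = re.compile(r'href="[^#"]*#(ftn\.)?id[a-z][0-9]+"')

def normalize_ids(html):
    """Replace generated ids and links to them with fixed placeholders."""
    html = ID_ATTR.sub('id="id"', html)
    return HREF_ATTR.sub('href="#"', html)

# Two hypothetical outputs that differ only in their generated ids.
run_a = '<a id="idp140001" href="spi.html#idp140001">x</a><a id="ftn.idm42" href="#ftn.idm42">1</a>'
run_b = '<a id="idp999999" href="spi.html#idp999999">x</a><a id="ftn.idm77" href="#ftn.idm77">1</a>'

print(normalize_ids(run_a) == normalize_ids(run_b))
```

Hand-written ids such as id="spi-connect" do not match the pattern and are left alone, so real differences in anchors would still show up in the comparison.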


So if this is an acceptable way to speed up generation of HTML (and maybe some
other formats), what other steps should we take to move away from SGML?
If the performance is still not satisfactory, please let me know and I'll
continue to optimize the XSLT.
Besides the performance issues, I can see some differences between the results
of 'make html' and 'make xslthtml'. For example, see doc/src/sgml/html/spi.html
(the XSLT-generated version doesn't contain the lists of functions).

Best regards,
Alexander





Attachment

Re: Moving documentation to XML

From
Guillaume Lelarge
Date:

On 26 Oct 2015 at 6:40 PM, "Alexander Lakhin" <a.lakhin@postgrespro.ru> wrote:
>
> Hello, Peter.
>
> I've managed to speed up html generation from xml (make xslthtml) from 32 min. (in my environment) to 4 min. by modifying slowest XSL templates.
> ...

What you've done is awesome. I can't wait to test it on the French translation.

Nice work!

Re: Moving documentation to XML

From
Alexander Lakhin
Date:
Hello, Guillaume.

We plan to use this for the Russian translation, too. We translate the
docs by converting the single XML file to postgres-ru.po (with xml2po), and
after translating it we convert it back to XML (producing postgres-ru.xml).
(Until now we had to perform one more conversion:
postgres-ru.xml -> a set of SGML files.)
So now we can get the Russian html/* with:
python xml2po.py -l ru -k -p postgres-ru.po postgres.xml >postgres-ru.xml
xsltproc --stringparam pg.version '9.4.1'  stylesheet.xsl postgres-ru.xml

But I had some doubts about the differences between DSSSL and XSL output. As I
noted previously, there was at least one visible difference. So I decided to
customize the XSL templates to make sure that the HTML files are generated
without loss or corruption.
I thought that comparing the two HTML sources directly would not work, as they
are too different, but maybe we can compare the text generated from the HTML
by lynx, for example.
So I use the following procedure to look for differences:
0. Get the DSSSL-generated HTML files:
make html
1. Extract the text content from the HTML files:
  for f in html/*.html; do
    fn=`basename "$f"`; echo "$fn"
    perl -0pe 's/<B\s*>Note:\s*<\/B\s*>/<h3>Note<\/h3>/g' "$f" |
      perl -0pe 's/><BLOCKQUOTE\s*CLASS="NOTE"/><div/ig' >"/tmp/$fn"
    lynx "/tmp/$fn" --dump >"html-text/$fn"
  done
* Some differences are not significant, so it's not reasonable to modify the
XSL templates to eliminate them. The difference in "Note" placement and
spelling is one of them, so I just filter it out.
2. Rename html to html-o and html-text to html-o-text.
3. Generate the HTML files with XSL (using the modified templates):
rm -r html; xsltproc --stringparam pg.version '9.4.1' stylesheet.xsl postgres.xml
4. Extract the text content from the HTML files as above.
5. Make sure that the two text versions are identical:
diff -s -u -b -I '^\s*_\+\s*$' html-o-text/xtypes.html html-text/xtypes.html
* Differences in whitespace and in the length of "____" lines are not
significant, either.
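
The comparison in step 5 can be approximated in Python for experimenting with the filters. This is only a sketch: it collapses all whitespace runs (slightly looser than diff -b) and drops every line consisting of underscores (diff -I only ignores hunks made up entirely of such lines). The names significant_lines and same_text are hypothetical.

```python
import re

# A horizontal rule in the lynx dump: nothing but underscores on the line.
RULE_LINE = re.compile(r'^\s*_+\s*$')

def significant_lines(text):
    """Return lines with whitespace collapsed and rule lines removed."""
    lines = []
    for line in text.splitlines():
        if RULE_LINE.match(line):
            continue  # the length of "____" rules is not significant
        lines.append(' '.join(line.split()))
    return lines

def same_text(a, b):
    """True if two text dumps differ only in whitespace or rule length."""
    return significant_lines(a) == significant_lines(b)

# Made-up samples standing in for two lynx dumps of the same page.
dsssl_dump = "Title\n   ______\nSome   text here\n"
xsl_dump = "Title\n ____________\nSome text here\n"

print(same_text(dsssl_dump, xsl_dump))
```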

For now, I've managed to get an identical xtypes.html (I tested my XSL
customizations with it), but I think we can eliminate most (or maybe all)
of the other notable differences in the same way.
I can describe the XSL customizations in more detail, if needed.

Best regards,
Alexander

P.S. I couldn't post the message as a reply due to an error on the
postgresql.org side.
(<pgsql-docs@postgresql.org>: host makus.postgresql.org[174.143.35.229] said:
     550 Message headers fail syntax check (in reply to end of DATA command))




Attachment

Re: Moving documentation to XML

From
Stefan Kaltenbrunner
Date:
On 10/30/2015 02:40 PM, Alexander Lakhin wrote:
> Hello, Guillaume.
>
> ...
>
> P.S. I couldn't post the message as a reply due to error on the
> postgresql.org side.
> (<pgsql-docs@postgresql.org>: host makus.postgresql.org[174.143.35.229]
> said:
>     550 Message headers fail syntax check (in reply to end of DATA
> command))

Sorry for not replying earlier, but most of the sysadmin team is not
tracking pgsql-docs that closely for issues. As far as I can see, there was a
typo in your mail: the "To" header looked like this:


To: pgsql-docs@postgresql org <pgsql-docs@postgresql.org>
  References: <1428009501118.85114@postgrespro.ru>
<5522E656.4060201@gmx.net>
 <562E061B.1090809@postgrespro.ru>
 <CAECtzeWiOkS=wVnk4T+4Bg3-z-5DGL09jp7ks5QAiDEO4d10+Q@mail.gmail.com>


Notice that there is a space after "pgsql-docs@postgresql" instead of
what I suspect should be a ".", which causes the header syntax check in exim
to barf on the mail.



Stefan


Re: Moving documentation to XML

From
Oleg Bartunov
Date:


On Mon, Oct 26, 2015 at 11:53 AM, Alexander Lakhin <a.lakhin@postgrespro.ru> wrote:
> Hello, Peter.
>
> I've managed to speed up html generation from xml (make xslthtml) from 32 min. (in my environment) to 4 min. by modifying slowest XSL templates.
> ...

I think this is a great result and it's worth starting the move to XML. I want to note that it's the 21st century and we should think about including pictures in our documentation, which would greatly improve it. XML makes this easier.

 










--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs


Re: Moving documentation to XML

From
Dmitry Igrishin
Date:


2015-11-03 20:46 GMT+03:00 Oleg Bartunov <obartunov@gmail.com>:


> On Mon, Oct 26, 2015 at 11:53 AM, Alexander Lakhin <a.lakhin@postgrespro.ru> wrote:
>> Hello, Peter.
>>
>> I've managed to speed up html generation from xml (make xslthtml) from 32 min. (in my environment) to 4 min. by modifying slowest XSL templates.
>> ...

> I think this is great result  and it's worth to start moving to xml.

I think that moving to XML is a step backward, because XML is ugly.

> I want to note, that it's 21-th century  and we should think about including pictures into our documentation, which will greatly improve it.

Yeah, +1.

> XML makes this easier.

And I think that Lisp is much better for this purpose.

--
// Dmitry.

Re: Moving documentation to XML

From
Tair Sabirgaliev
Date:
Then there is asciidoc..
On Wed, 4 Nov 2015 at 2:11, Dmitry Igrishin <dmitigr@gmail.com> wrote:
2015-11-03 20:46 GMT+03:00 Oleg Bartunov <obartunov@gmail.com>:


On Mon, Oct 26, 2015 at 11:53 AM, Alexander Lakhin <a.lakhin@postgrespro.ru> wrote:
Hello, Peter.

I've managed to speed up html generation from xml (make xslthtml) from 32 min. (in my environment) to 4 min. by modifying slowest XSL templates.
All my modifications incorporated in a single file stylesheet-xhtml-speedup.xsl, which is included in stylesheet.xsl.
I performed optimization by analyzing output of:
xsltproc --profile --stringparam pg.version '9.6devel' stylesheet.xsl postgres.xml
Initial statistics:
number               match                name      mode Calls Tot 100us Avg

    0             appendix label.markup 23090 90677526   3927
    1              chapter label.markup 28870 39740757   1376
    2 chunk-all-sections             1289 23845066  18498
    3                     make.legalnotice.head.links 2578 9630258   3735
    4            indexterm reference   2579 4126513   1600
    5 html.head             1289 3112534   2414
...
index % time    self  children    called     name
                0.479 1326.034     22/23090     toc.line [61]
                5.128 1308.245  21944/23090 sect1[label.markup] [13]
                3.772 1318.264    850/23090 substitute-markup [15]
                1.355 1304.631    274/23090 figure|table|example[label.markup] [32]
[0]    47.95  906.775    1.613  23090 appendix[label.markup] [0]
                1.613    0.000  23090/23090 autolabel.format [29]
-----------------------------------------------
                5.128 1308.245  24708/28870 sect1[label.markup] [13]
                0.479 1326.034    130/28870     toc.line [61]
                3.772 1318.264   2112/28870 substitute-markup [15]
                1.355 1304.631   1920/28870 figure|table|example[label.markup] [32]
[1]    21.01  397.408    1.613  28870 chapter[label.markup] [1]
                1.613    0.000  28870/28870 autolabel.format [29]
-----------------------------------------------
                0.164  238.606   1289/1289 process-chunk-element [98]
[2]    12.61  238.451    0.225   1289 chunk-all-sections [2]
                0.225    7.117   1289/1289 process-chunk [86]
-----------------------------------------------
               31.125  112.261   1289/2578      html.head [5]
               96.303   96.726   1289/2578 make.legalnotice.head.links [3]
[3]     5.09   96.303   96.726   2578 make.legalnotice.head.links [3]
               96.303   96.726   1289/3867 make.legalnotice.head.links [3]
                0.339    0.494   1289/3867 *[object.title.markup.textonly] [69]
                0.085    0.781   1289/3867 ln.or.rh.filename [116]

-----------------------------------------------


Current statistics:
number               match                name      mode Calls Tot 100us Avg

    0 chunk-all-sections             1289 5405958   4193
    1                     make.legalnotice.head.links 1289 3159538   2451
    2 html.head             1289 3068417   2380
    3                         gentext.template 689835 2327761      3
    4                            l10n.language 564453 1455253      2
    5 href.target            29881 1344063     44
---
index % time    self  children    called     name
                0.136   54.207   1289/1289 process-chunk-element [95]
[0]    20.40   54.060    0.312   1289 chunk-all-sections [0]
                0.312    6.468   1289/1289 process-chunk [67]
-----------------------------------------------
               30.684   45.458   1289/1289      html.head [2]
[1]    11.92   31.595    0.448   1289 make.legalnotice.head.links [1]
                0.290    0.403   1289/2578 *[object.title.markup.textonly] [71]
                0.159    0.828   1289/2578 ln.or.rh.filename [91]
-----------------------------------------------
                0.330   31.617   1289/1289 chunk-element-content [65]
[2]    11.58   30.684   45.458   1289     html.head [2]
               31.595    0.448   1289/15462 make.legalnotice.head.links [1]
               13.441    4.726   5153/15462 href.target [5]
                0.290    0.403   5153/15462 *[object.title.markup.textonly] [71]
                0.115    1.576   1289/15462 head.content [99]
                0.012    0.000   1289/15462 system.head.content [186]
                0.006    0.000   1289/15462 user.head.content [228]
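
For readers comparing the two profiles, the gain per template can be worked out from the "Tot 100us" column. The snippet below is only a sketch: the totals are copied from the flat profiles quoted in this message for the three templates that appear in both listings, the 32 and 4 minute figures are the wall-clock times reported above, and the variable and function names are made up for illustration. The totals are compared as ratios only, so the unit label is taken from the profiler header as-is.

```python
# "Tot 100us" values copied from the initial and current flat profiles above.
# Only templates listed in both profiles are included.
initial = {
    "chunk-all-sections": 23845066,
    "make.legalnotice.head.links": 9630258,
    "html.head": 3112534,
}
current = {
    "chunk-all-sections": 5405958,
    "make.legalnotice.head.links": 3159538,
    "html.head": 3068417,
}

# Wall-clock build times reported in the message, in minutes.
initial_minutes = 32
current_minutes = 4


def speedup(name: str) -> float:
    """Ratio of the initial total to the current total for one template."""
    return initial[name] / current[name]


if __name__ == "__main__":
    for name in initial:
        print(f"{name:30s} {initial[name]:>10d} -> {current[name]:>9d}  x{speedup(name):.2f}")
    print(f"overall build: x{initial_minutes / current_minutes:.1f}")
```

The two label.markup templates that dominated the initial profile are left out because they no longer appear in the part of the current profile shown here.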

To make sure that the result of the transformation is the same, I compared the original .html files with the .html files generated with the modified templates.
Unfortunately, XSLT generates random ids, so they need to be excluded before comparing. I do that with:
for f in */*.html; do sed -i -e 's/id="\(ftn\.\)\?id[a-z][0-9]\+"/id="id"/g' -e 's/href="[^#]*#\(ftn\.\)\?id[a-z][0-9]\+"/href="#"/g' "$f"; done
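
The same normalization can be sketched in Python for anyone who prefers not to rewrite the generated files in place. This is an illustration rather than the script actually used: the two patterns are direct translations of the sed expressions above, while the function names and the sample HTML strings (including the id values) are made up for the example.

```python
import re

# Translations of the two sed expressions: generated ids look like
# id="idX123" or id="ftn.idX123", and links to them end in "#idX123".
ID_ATTR = re.compile(r'id="(?:ftn\.)?id[a-z][0-9]+"')
HREF_ATTR = re.compile(r'href="[^#\n]*#(?:ftn\.)?id[a-z][0-9]+"')


def normalize_ids(html: str) -> str:
    """Replace generated ids and the links pointing at them with fixed text."""
    html = ID_ATTR.sub('id="id"', html)
    return HREF_ATTR.sub('href="#"', html)


def same_after_normalizing(a: str, b: str) -> bool:
    """True if two HTML strings differ only in generated ids."""
    return normalize_ids(a) == normalize_ids(b)


if __name__ == "__main__":
    # Hypothetical fragments from two builds; only the generated ids differ.
    old = '<a id="idp12345"></a><a href="spi.html#ftn.idp12345">[1]</a>'
    new = '<a id="idm67890"></a><a href="spi.html#ftn.idm67890">[1]</a>'
    print(normalize_ids(old))
    print(same_after_normalizing(old, new))
```

Hand-written ids such as id="SPI" do not match the patterns and are left untouched, so real differences in anchors still show up in the comparison.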


So if this is an acceptable way to speed up generation of HTML (and maybe some other formats), what other steps should we take to move away from SGML?
If the performance is still not satisfactory, please let me know and I'll continue to optimize the XSLT.
Besides the performance issues, I can see some differences between the results of 'make html' and 'make xslthtml'. For example, see doc/src/sgml/html/spi.html (the XSLT-generated version doesn't contain the lists of functions).

Best regards,
Alexander

I think this is a great result, and it's worth starting the move to XML.
I think that moving to XML is a step backward, because XML is ugly.
I want to note that it's the 21st century, and we should think about including pictures in our documentation, which would greatly improve it.
Yeah, +1.
XML makes this easier.
And I think that Lisp is much better for this purpose.

--
// Dmitry.