Re: Getting our tables to render better in PDF output - Mailing list pgsql-docs

From Alexander Lakhin
Subject Re: Getting our tables to render better in PDF output
Msg-id e794bbd7-32f2-157d-8c7d-7bbfe4262e81@gmail.com
In response to Re: Getting our tables to render better in PDF output  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-docs
Hello Alvaro,
14.02.2020 23:16, Alvaro Herrera wrote:
> On 2020-Feb-13, Alexander Lakhin wrote:
>
>> Yes, I was starting with manual &zwsp; insertions into the translation,
>> but later I reduced such insertions to just several dozen. (For
>> example, we still have "3.1415926535&zwsp;8979323846" in the translation.)
>> The main issue with the manual approach was that I needed to recheck the
>> zwsp placement on updates, and I can't see where it's desired until I
>> generate the PDF. Fortunately, FOP prints warnings like this:
>> [WARN] FOUserAgent - The contents of fo:block line 2 exceed the
>> available area in the inline-progression direction by 22725 millipoints.
>> (See position 127769:983)
>> It's not very user-friendly, but still useful when we have a pair or two
>> of them.
>
> It seems to me that a productive way forward would be to fix the layout
> to make these warnings disappear. Then it will be relatively easy to find
> where to fix, if new ones appear.
>
> Now I suppose you're complaining about the "position 127769:983" part of
> the error message which tells you with zero clarity where the problem
> is.  Maybe what we need is to figure out what the numbers mean, and how
> to use them; for example if they are byte offsets into the file, then it
> should be possible to tell your editor to go to that byte in the
> complete XML file.
I'm not complaining about the cryptic positions of the problems; I'm concerned with their number.
The position is specified as {line_number}:{character_position} in postgres-*.fo (not in the DocBook source).
For example, when performing `make postgres-A4.pdf` on REL_12_STABLE I get:
[WARN] FOUserAgent - The contents of fo:block line 1 exceed the available area in the inline-progression direction by more than 50 points. (See position 28808:374)

To find the exact problematic text, you can look at the specified line(s) of postgres-A4.fo:
$ sed -n '28808,28811p' postgres-A4.fo
<fo:block id="id-1.5.13.4.7.12.1" wrap-option="wrap" text-align="start" space-before.minimum="0.8em" space-before.optimum="1em" space-before.maximum="1.2em" space-after.minimum="0.8em" space-after.optimum="1em" space-after.maximum="1.2em" hyphenate="false" white-space-collapse="false" white-space-treatment="preserve" linefeed-treatment="preserve" font-family="monospace">
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 100;

Searching for this text in the PDF gets you to page 467, where you can see a long line of '---' running off the page...
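
The lookup above can also be scripted. Below is a minimal Python sketch, assuming the FOP warnings were saved to a log file and that the position is a 1-based line number followed by a character offset, as described above. The function names `overflow_positions` and `lines_at` are made up for this illustration; they are not part of FOP or of the PostgreSQL build.

```python
import re

# Matches the "(See position LINE:COLUMN)" suffix of a FOP overflow warning.
POSITION_RE = re.compile(r"\(See position (\d+):(\d+)\)")


def overflow_positions(log_text):
    """Return a (line, column) pair for every position reported in log_text."""
    return [(int(m.group(1)), int(m.group(2)))
            for m in POSITION_RE.finditer(log_text)]


def lines_at(fo_text, line_no, extra_lines=3):
    """Return line `line_no` (1-based) of the .fo text plus `extra_lines` after it.

    Splitting on "\n" only is deliberate: it counts lines the same way sed does,
    whereas str.splitlines() would also split on other separator characters.
    """
    lines = fo_text.split("\n")
    return lines[line_no - 1:line_no + extra_lines]
```

For example, `lines_at(open("postgres-A4.fo", encoding="utf-8").read(), 28808)` should return the same four lines as the sed command, and `len(overflow_positions(log))` gives the number of warnings in a log.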
>> The third (minor) issue is with translation - when I see some break in
>> the English source, e.g. "split_part('abc~@~def&zwsp;~@~ghi', '~@~',
>> 2)", should I leave the break in the same place, or is it better to move
>> it because the adjacent text has a different length and the table
>> columns have different widths?
>
> If the English version is warning-clean, then it should be possible to
> keep the zwsps in the same location in the translation, and then tweak
> the translation according to any new warnings that appear there.
> My guess is that the majority of zwsps are going to want to stay in the
> same place.
Yes, that's why I consider this a minor issue, but some kind of automatic solution could eliminate it altogether.
>> Maybe some of the rules can be implemented explicitly in the DocBook
>> source, just to reduce tons of zwsp in the generated output, or the
>> "fo:table-cell/fo:block//text()" condition can be improved to filter
>> some (text-only?) tables out, but I think that the idea of our specific
>> line breaking rules could work.
>
> Maybe we can mark up specific table cells/columns as being subject to
> the special line breaking rules.
Things are complicated by the XSLT preprocessor, because you can't see DocBook tags and attributes at the FOP level, but I can explore possible resolutions if we choose to go this way.
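
To make the idea of rule-based break insertion more concrete, here is a small Python sketch. It is only an illustration and not the XSLT under discussion. The rule set is hypothetical (allow a break after `,`, `~` or `/` when a word character follows, and only inside whitespace-free tokens of 20 characters or more), and it assumes &zwsp; expands to U+200B. The name `add_break_hints` is invented for the example.

```python
import re

ZWSP = "\u200b"  # assumption: this is the character the &zwsp; entity expands to

# Hypothetical rule: a break is allowed after one of these characters,
# but only when a word character follows it.
BREAK_AFTER = re.compile(r"([,~/])(?=\w)")


def add_break_hints(text, min_token_len=20):
    """Insert ZWSP break hints into long whitespace-free tokens of `text`.

    Tokens shorter than `min_token_len`, and tokens that already contain a
    ZWSP (e.g. placed manually), are left untouched.
    """
    def fix(match):
        token = match.group(0)
        if len(token) < min_token_len or ZWSP in token:
            return token
        return BREAK_AFTER.sub(lambda m: m.group(1) + ZWSP, token)

    return re.sub(r"\S+", fix, text)
```

Because tokens that already contain a break hint are skipped, running the function twice gives the same result as running it once, and manual placements are left alone.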

Best regards,
Alexander
