Thread: Re: [HACKERS] [GENERAL] Postgres 10 manual breaks links with anchors

Re: [HACKERS] [GENERAL] Postgres 10 manual breaks links with anchors

From
Peter Eisentraut
Date:
On 10/16/17 03:19, Thomas Kellerer wrote:
> I don't know if this is intentional, but the Postgres 10 manual started to use lowercase IDs as anchors in the
> manual.
> 
> So, if I have e.g.: the following URL open in my browser:
> 
>    https://www.postgresql.org/docs/current/static/sql-createindex.html#sql-createindex-concurrently
> 
> I cannot simply switch to an older version by replacing "current" with e.g. "9.5" because in the 9.5 manual the
> anchor was all uppercase, and the URL would need to be:
>
>    https://www.postgresql.org/docs/9.5/static/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY
> 
> Is this intentional? 
> 
> This also makes "cleaning" up links in e.g. StackOverflow that point to outdated versions of the manual a bit more
> cumbersome.
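
To make the link cleanup described above concrete, here is a minimal Python sketch that switches a documentation URL to another version and adjusts the case of the fragment. It is only an illustration: the function names are made up, and it assumes the situation as reported in this thread (upper-case anchors up to 9.6, lower-case anchors in 10 and in "current"), with URLs of the form /docs/<version>/static/<page>.html.

```python
from urllib.parse import urlsplit, urlunsplit


def uses_uppercase_anchors(version: str) -> bool:
    """Assumption from the thread: manuals up to 9.6 use upper-case anchors."""
    if version in ("current", "devel"):
        return False
    major = int(version.split(".")[0])
    return major < 10


def switch_docs_version(url: str, version: str) -> str:
    """Return the URL pointing at another manual version (hypothetical helper)."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # Expected path shape: /docs/<version>/static/<page>.html
    if len(segments) < 3 or segments[1] != "docs":
        raise ValueError("not a documentation URL: " + url)
    segments[2] = version
    if uses_uppercase_anchors(version):
        fragment = parts.fragment.upper()
    else:
        fragment = parts.fragment.lower()
    return urlunsplit(
        (parts.scheme, parts.netloc, "/".join(segments), parts.query, fragment)
    )


old = switch_docs_version(
    "https://www.postgresql.org/docs/current/static/"
    "sql-createindex.html#sql-createindex-concurrently",
    "9.5",
)
print(old)
```

Only the fragment changes case here; the file name stays lower case in both URLs quoted above.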
 

Here is a patch that can be applied to PG 10 to put the upper case
anchors back.

The question perhaps is whether we want to maintain this patch
indefinitely, or whether a clean break is better.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] Postgres 10 manual breaks links with anchors

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 10/16/17 03:19, Thomas Kellerer wrote:
>> I don't know if this is intentional, but the Postgres 10 manual started to use lowercase IDs as anchors in the
>> manual.

> Here is a patch that can be applied to PG 10 to put the upper case
> anchors back.
> The question perhaps is whether we want to maintain this patch
> indefinitely, or whether a clean break is better.

In view of commit 1ff01b390, aren't we more or less locked into
lower-case anchors going forward?  I'm not sure I see the point
of changing v10 back to the old way if v11 will be incompatible
anyhow.
        regards, tom lane



Re: [HACKERS] [GENERAL] Postgres 10 manual breaks links with anchors

From
Peter Eisentraut
Date:
On 10/26/17 16:10, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
>> On 10/16/17 03:19, Thomas Kellerer wrote:
>>> I don't know if this is intentional, but the Postgres 10 manual started to use lowercase IDs as anchors in the
>>> manual.
> 
>> Here is a patch that can be applied to PG 10 to put the upper case
>> anchors back.
>> The question perhaps is whether we want to maintain this patch
>> indefinitely, or whether a clean break is better.
> 
> In view of commit 1ff01b390, aren't we more or less locked into
> lower-case anchors going forward?  I'm not sure I see the point
> of changing v10 back to the old way if v11 will be incompatible
> anyhow.

The details are more complicated.

The IDs in DocBook documents have two purposes.

One is to ensure non-broken links between things like <sect1 id="foo">
and <xref linkend="foo">.  This is set up in the DTD and checked during
parsing (validation, more precisely).  In DocBook SGML, many things
including tag names, attribute names, and IDs are case insensitive.  But
in DocBook XML, everything is case sensitive.  So in order to make
things compatible for a conversion, we had to consolidate some variant
spellings that have accumulated in our sources.  For simplicity, I have
converted everything to lower case.
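
The effect of case-sensitive matching can be illustrated with a small Python sketch. This is not how the build validates the documents (that is done by a validating parser against the DTD); the helper name and the sample documents are made up, and the code only shows that an exact string comparison of id and linkend leaves a variant spelling dangling.

```python
import xml.etree.ElementTree as ET


def dangling_linkends(xml_text: str) -> list:
    """List xref linkend values that match no id exactly (hypothetical check)."""
    root = ET.fromstring(xml_text)
    ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    missing = set()
    for xref in root.iter("xref"):
        target = xref.get("linkend")
        if target is not None and target not in ids:
            missing.add(target)
    return sorted(missing)


# Mixed-case spelling, as could accumulate in case-insensitive SGML sources.
mixed = """
<chapter id="SQL-CreateIndex">
  <sect1 id="foo">
    <para><xref linkend="foo"/> <xref linkend="sql-createindex"/></para>
  </sect1>
</chapter>
"""

# The same document after converting every id and linkend to lower case.
lowered = """
<chapter id="sql-createindex">
  <sect1 id="foo">
    <para><xref linkend="foo"/> <xref linkend="sql-createindex"/></para>
  </sect1>
</chapter>
"""

print(dangling_linkends(mixed))
print(dangling_linkends(lowered))
```

With the mixed-case source, the second xref has no exact match; once everything is lower case, all links resolve.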

The other purpose is that the DocBook XSL and DSSSL stylesheets use the
IDs for creating anchors in HTML documents (and also for the HTML file
names themselves).  This is merely a useful choice of those stylesheets.

In PG 9.6 and earlier, we used a straight SGML toolchain, using Jade and
DSSSL.  The internal representation of a DocBook SGML document after
parsing converts all the case insensitive bits to upper case.  (This
might be configured somewhere; I'm not sure.)  So the stylesheets see
all the IDs as upper case to begin with, and that's why all the anchors
come out in upper case in the HTML output.

In PG 10, the build first converts the SGML sources to XML, redeclares
them as DocBook XML, then builds using XSLT.  Because DocBook XML
requires lower-case tags and attribute names, we have to use the osx -x
lower option to convert all the case-insensitive bits to lower case
instead of the default upper case.  That's why the XSLT stylesheets see
the IDs as lower case and that's why they are like that in the output.
(If there were options more detailed than -x lower, that could have been
useful.)

The proposed patch works much later in the build process and converts
IDs to upper case only when they are being considered for making an HTML
anchor.  The structure of the document as far as the XML parser is
concerned stays the same.
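
As a rough illustration of that idea (the actual patch is a stylesheet customization, not Python, and the function names below are invented), the upper-casing would happen only at the point where an ID is written out as an HTML anchor, while the IDs used for link resolution stay in lower case:

```python
def html_anchor(doc_id: str) -> str:
    """Upper-case an ID only when it becomes an HTML anchor (hypothetical)."""
    return doc_id.upper()


def render_section(doc_id: str, title: str) -> str:
    # The document keeps the lower-case ID; only the emitted anchor changes.
    return '<h2 id="{}">{}</h2>'.format(html_anchor(doc_id), title)


def render_link(page_file: str, target_id: str, text: str) -> str:
    # The file name is left alone, matching the URLs quoted earlier in the
    # thread, where only the fragment is upper case.
    return '<a href="{}#{}">{}</a>'.format(page_file, html_anchor(target_id), text)


link = render_link(
    "sql-createindex.html", "sql-createindex-concurrently", "CONCURRENTLY"
)
print(link)
```

The point of doing this late is that nothing about ID matching or the document structure has to change; only the generated HTML differs.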

For PG 11, the idea is to convert the sources to a pure XML document.
XML is case sensitive, so the XML parser would see the IDs as what
they are.  Without the commit mentioned above, which converted all IDs
to lower case in the source, the XSL processor would see the IDs in
whatever case they are, and anchors would end up in the HTML output
using whatever case they are.  So the conversion to lower case in the
source also ensured anchor compatibility with PG 10.  Otherwise, someone
might well have complained in a similar manner a year from now.

Applying the proposed patch to master/PG 11 would have the same effect
as in PG 10.  It would convert anchors to upper case in the HTML output
but leave the logical and physical structure of the XML document alone.

So the options are simply

1) Use the patch and keep indefinitely, keeping anchors compatible back
to forever and forward indefinitely.

2) Don't use the patch, breaking anchors from <=9.6, but keeping them
compatible going forward.

Considering how small the patch is compared to some other customizations
we carry, #1 seems reasonable to me.  I just didn't know to what extent
people had actually bookmarked fragment links.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] [GENERAL] Postgres 10 manual breaks links with anchors

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 10/26/17 16:10, Tom Lane wrote:
>> In view of commit 1ff01b390, aren't we more or less locked into
>> lower-case anchors going forward?

> The details are more complicated. ...

Ah.  I'd imagined that we were using the original case for the anchors,
rather than smashing them to upper (or lower) case.

> So the options are simply
> 1) Use the patch and keep indefinitely, keeping anchors compatible back
> to forever and forward indefinitely.
> 2) Don't use the patch, breaking anchors from <=9.6, but keeping them
> compatible going forward.
>
> Considering how small the patch is compared to some other customizations
> we carry, #1 seems reasonable to me.

+1
        regards, tom lane

