Thread: [doc] remove reference to pg_dump pre-8.1 switch behaviour

[doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Ian Lawrence Barwick
Date:
Hi

The pg_dump doc page [1], under the -t/--table option, contains a Note
documenting the behavioural differences introduced in PostgreSQL 8.2.

As it's been almost exactly 14 years since that note was added [2], I suggest
it can be removed entirely.

[1] https://www.postgresql.org/docs/current/app-pgdump.html
[2]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=doc/src/sgml/ref/pg_dump.sgml;h=9aa4baf84e74817a3c3e8359b2c4c8a847fda987;hp=deafd7c9a989c2cbce3979d94416a298609f5e84;hb=24e97528631e7e810ce61fc0f5fbcaca0c001c4c;hpb=77d2b1b625c7decd7a25ec865bced3b927de6d4b


Regards

Ian Barwick


--
EnterpriseDB: https://www.enterprisedb.com



Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Ian Lawrence Barwick
Date:
On Tue, Oct 6, 2020 at 21:13, Ian Lawrence Barwick <barwick@gmail.com> wrote:
>
> Hi
>
> The pg_dump doc page [1], under the -t/--table option, contains a Note
> documenting the behavioural differences introduced in PostgreSQL 8.2.
>
> As it's been almost exactly 14 years since that note was added [2], I suggest
> it can be removed entirely.
>
> [1] https://www.postgresql.org/docs/current/app-pgdump.html
> [2]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=doc/src/sgml/ref/pg_dump.sgml;h=9aa4baf84e74817a3c3e8359b2c4c8a847fda987;hp=deafd7c9a989c2cbce3979d94416a298609f5e84;hb=24e97528631e7e810ce61fc0f5fbcaca0c001c4c;hpb=77d2b1b625c7decd7a25ec865bced3b927de6d4b


Oh yes, I was planning to attach an ultra-trivial patch for that too.


Regards

Ian Barwick
--
EnterpriseDB: https://www.enterprisedb.com

Attachment

Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Heikki Linnakangas
Date:
On 06/10/2020 15:15, Ian Lawrence Barwick wrote:
> On Tue, Oct 6, 2020 at 21:13, Ian Lawrence Barwick <barwick@gmail.com> wrote:
>>
>> Hi
>>
>> The pg_dump doc page [1], under the -t/--table option, contains a Note
>> documenting the behavioural differences introduced in PostgreSQL 8.2.
>>
>> As it's been almost exactly 14 years since that note was added [2], I suggest
>> it can be removed entirely.
>>
>> [1] https://www.postgresql.org/docs/current/app-pgdump.html
>> [2]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=doc/src/sgml/ref/pg_dump.sgml;h=9aa4baf84e74817a3c3e8359b2c4c8a847fda987;hp=deafd7c9a989c2cbce3979d94416a298609f5e84;hb=24e97528631e7e810ce61fc0f5fbcaca0c001c4c;hpb=77d2b1b625c7decd7a25ec865bced3b927de6d4b
> 
> 
> Oh yes, I was planning to attach an ultra-trivial patch for that too.

Applied, thanks.

- Heikki



Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Ian Lawrence Barwick
Date:
On Fri, Oct 23, 2020 at 17:52, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> On 06/10/2020 15:15, Ian Lawrence Barwick wrote:
> > On Tue, Oct 6, 2020 at 21:13, Ian Lawrence Barwick <barwick@gmail.com> wrote:
> >>
> >> Hi
> >>
> >> The pg_dump doc page [1], under the -t/--table option, contains a Note
> >> documenting the behavioural differences introduced in PostgreSQL 8.2.
> >>
> >> As it's been almost exactly 14 years since that note was added [2], I suggest
> >> it can be removed entirely.
> >>
> >> [1] https://www.postgresql.org/docs/current/app-pgdump.html
> >> [2]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=doc/src/sgml/ref/pg_dump.sgml;h=9aa4baf84e74817a3c3e8359b2c4c8a847fda987;hp=deafd7c9a989c2cbce3979d94416a298609f5e84;hb=24e97528631e7e810ce61fc0f5fbcaca0c001c4c;hpb=77d2b1b625c7decd7a25ec865bced3b927de6d4b
> >
> >
> > Oh yes, I was planning to attach an ultra-trivial patch for that too.
>
> Applied, thanks.
>
> - Heikki

Thanks!


Regards

Ian Barwick

--
EnterpriseDB: https://www.enterprisedb.com



Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Stephen Frost
Date:
Greetings,

* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> On 06/10/2020 15:15, Ian Lawrence Barwick wrote:
> >On Tue, Oct 6, 2020 at 21:13, Ian Lawrence Barwick <barwick@gmail.com> wrote:
> >>The pg_dump doc page [1], under the -t/--table option, contains a Note
> >>documenting the behavioural differences introduced in PostgreSQL 8.2.
> >>
> >>As it's been almost exactly 14 years since that note was added [2], I suggest
> >>it can be removed entirely.
> >>
> >>[1] https://www.postgresql.org/docs/current/app-pgdump.html
> >>[2]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=doc/src/sgml/ref/pg_dump.sgml;h=9aa4baf84e74817a3c3e8359b2c4c8a847fda987;hp=deafd7c9a989c2cbce3979d94416a298609f5e84;hb=24e97528631e7e810ce61fc0f5fbcaca0c001c4c;hpb=77d2b1b625c7decd7a25ec865bced3b927de6d4b
> >
> >
> >Oh yes, I was planning to attach an ultra-trivial patch for that too.
>
> Applied, thanks.

Isn't this a bit premature as we still support running pg_dump against
8.0 clusters..?

Removing support for older clusters is certainly something we can
discuss but I don't know that it makes sense to just piecemeal pull
things out.  I get that this was just a documentation note, but, still,
we do support pg_dump run against 8.0 and 8.1 clusters, at least today.

Thanks,

Stephen

Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> Isn't this a bit premature as we still support running pg_dump against
> 8.0 clusters..?

The removed para was discussing the behavior of pg_dump itself.  What
server version you run it against isn't relevant.

Having said that, there are a *lot* of past-their-sell-by-date bits
of info throughout our documentation, because we don't have any sort
of policy or mechanism for getting rid of this kind of backwards
compatibility note.  Maybe we should first try to agree on a policy
for when it's okay to remove such info.

            regards, tom lane



Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Stephen Frost
Date:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Isn't this a bit premature as we still support running pg_dump against
> > 8.0 clusters..?
>
> The removed para was discussing the behavior of pg_dump itself.  What
> server version you run it against isn't relevant.

Ah, alright, that makes a bit more sense then..

> Having said that, there are a *lot* of past-their-sell-by-date bits
> of info throughout our documentation, because we don't have any sort
> of policy or mechanism for getting rid of this kind of backwards
> compatibility note.  Maybe we should first try to agree on a policy
> for when it's okay to remove such info.

I would have thought the general policy would be "match what the tool
works with", so if we've got references to things about how pg_dump
works against older-than-8.0 then we should clearly remove those as
pg_dump no longer will run against versions that old.

Extending that to more general notes would probably make sense though.
That is- we'll keep anything relevant to the oldest version that pg_dump
runs against (since I'm pretty sure pg_dump's compatibility goes the
farthest back of anything we've got in core and probably always will).

We do need to decide at what point we're going to move forward pg_dump's
oldest server version support.  I had thought we would do that with each
top-level major version change (eg: support 8.0+ until we reach 11.0 or
something), but that doesn't work since we've moved to a single integer
for major versions.  Looking at the timeline though:

2016-10-12: 64f3524e2c8deebc02808aa5ebdfa17859473add Removed pre-8.0
2005-01-19: 8.0 released

So, that's about 10 years.

2010-09-20: 9.0 released

Or about 10 years from today, which seems to me to imply we should
probably be considering moving pg_dump forward already.  I'm not really
inclined to do this every year as I don't really think it's helpful, but
once every 5 years or so probably makes sense.  To be a bit more
specific about my thoughts:

- Move pg_dump up to 9.0 as the required minimum, starting with v14.
- In about 5 years or so, move pg_dump up to minimum of v10.

(clean up all documentation with older references and such too)

If we wanted to be particularly cute about it, we could wait until v15
to drop support for older-than-9.0, and then v20 would remove support
for older-than-10, and then v25 would remove support for
older-than-v15, etc.
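
For anyone who wants to check the arithmetic behind the timeline above, here is a minimal sketch. The three dates are the ones quoted in this message; "today" is an assumption, taken as 2020-10-23 from the dates on the surrounding messages. Dividing by 365.25 is only an approximation of a year.

```python
from datetime import date

# Dates quoted in the message above.
PG_8_0_RELEASED = date(2005, 1, 19)
PRE_8_0_SUPPORT_REMOVED = date(2016, 10, 12)  # commit 64f3524e...
PG_9_0_RELEASED = date(2010, 9, 20)

# Assumption: "today" is the day this message was sent.
THREAD_DATE = date(2020, 10, 23)


def years_between(start: date, end: date) -> float:
    """Return the gap between two dates in approximate years."""
    return (end - start).days / 365.25


gap_8_0 = years_between(PG_8_0_RELEASED, PRE_8_0_SUPPORT_REMOVED)
gap_9_0 = years_between(PG_9_0_RELEASED, THREAD_DATE)

print(f"8.0 release -> pre-8.0 support removed: {gap_8_0:.1f} years")
print(f"9.0 release -> this thread:             {gap_9_0:.1f} years")
```

By this count the first gap is nearer twelve years than ten, while the second is almost exactly ten.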

Thanks,

Stephen

Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Ian Lawrence Barwick
Date:
On Fri, Oct 23, 2020 at 23:12, Stephen Frost <sfrost@snowman.net> wrote:
>
> Greetings,
>
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> > Stephen Frost <sfrost@snowman.net> writes:
> > > Isn't this a bit premature as we still support running pg_dump against
> > > 8.0 clusters..?
> >
> > The removed para was discussing the behavior of pg_dump itself.  What
> > server version you run it against isn't relevant.
>
> Ah, alright, that makes a bit more sense then..

Yes, it's removing a note regarding a behavioural change in pg_dump that was
introduced in 8.2. This will severely inconvenience anyone who has emerged
from a coma they fell into before December 2006 and who is just getting to grips
with the brave new world of post-8.1 pg_dump, but anyone running pg_dump
against an 8.x server has hopefully caught up with the change sometime
during the last 14 years.

> > Having said that, there are a *lot* of past-their-sell-by-date bits
> > of info throughout our documentation, because we don't have any sort
> > of policy or mechanism for getting rid of this kind of backwards
> > compatibility note.  Maybe we should first try to agree on a policy
> > for when it's okay to remove such info.
>
> I would have thought the general policy would be "match what the tool
> works with", so if we've got references to things about how pg_dump
> works against older-than-8.0 then we should clearly remove those as
> pg_dump no longer will run against versions that old.
>
> Extending that to more general notes would probably make sense though.
> That is- we'll keep anything relevant to the oldest version that pg_dump
> runs against (since I'm pretty sure pg_dump's compatibility goes the
> farthest back of anything we've got in core and probably always will).

Obviously any references to supporting functionality which is no longer
actually supported should be updated/removed. Any notes about behavioural
differences between two versions no longer under community support (such as
the bit removed by this patch) seem like fair game (though I'm sure there are
exceptions). However I'm not sure what else there is out there which needs
consideration.

> We do need to decide at what point we're going to move forward pg_dump's
> oldest server version support.  (...)

I suggest starting a new thread for that.


Regards

Ian Barwick



--
EnterpriseDB: https://www.enterprisedb.com



Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> We do need to decide at what point we're going to move forward pg_dump's
> oldest server version support.

I'm not really in a big hurry to move it forward at all.  There were
good solid reasons to drop support for pre-schema and pre-pg_depend
servers, because of the messy kluges pg_dump had to implement
to provide only-partial workarounds for those lacks.  But I don't
see comparable reasons or code savings that we'll get from dropping
later versions.

There is an argument for dropping support for server versions that
fail to build anymore with modern toolchains, since once that happens
it becomes difficult to test, unless you have old executables already
lying around.  But I don't think we're at that point yet for 8.0 or
later.  (I rebuilt 7.4 and later when I updated my workstation to
RHEL8 a few months ago, and they seem fine, though I did use -O0 out of
fear of -faggressive-loop-optimizations bugs for anything before 8.2.)

But anyway, this was about documentation not code.  What I'm wondering
about is when to drop things like, say, this bit in the regex docs:

    Two significant incompatibilities exist between AREs and the ERE syntax
    recognized by pre-7.4 releases of <productname>PostgreSQL</productname>:
    (etc etc)

Seems like we could have gotten rid of that by now, but when exactly
does it become fair game?  And can we have a non-ad-hoc process for
getting rid of such cruft?

            regards, tom lane



Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Stephen Frost
Date:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > We do need to decide at what point we're going to move forward pg_dump's
> > oldest server version support.
>
> I'm not really in a big hurry to move it forward at all.  There were
> good solid reasons to drop support for pre-schema and pre-pg_depend
> servers, because of the messy kluges pg_dump had to implement
> to provide only-partial workarounds for those lacks.  But I don't
> see comparable reasons or code savings that we'll get from dropping
> later versions.
>
> There is an argument for dropping support for server versions that
> fail to build anymore with modern toolchains, since once that happens
> it becomes difficult to test, unless you have old executables already
> lying around.  But I don't think we're at that point yet for 8.0 or
> later.  (I rebuilt 7.4 and later when I updated my workstation to
> RHEL8 a few months ago, and they seem fine, though I did use -O0 out of
> fear of -faggressive-loop-optimizations bugs for anything before 8.2.)

Along those same lines though- keeping all of the versions working with
pg_dump requires everyone who is working with pg_dump to have those old
versions not just able to compile but to also take the time to test
against those older versions when making changes.

> But anyway, this was about documentation not code.

Perhaps it didn't come across very well, but I was making an argument
that we should consider them both under a general "every 5 years, go
through and clean out anything that's older than 10 years" type of
policy.  I don't know that we need to spend time doing it every year,
but I wouldn't be against it either.

> What I'm wondering
> about is when to drop things like, say, this bit in the regex docs:
>
>     Two significant incompatibilities exist between AREs and the ERE syntax
>     recognized by pre-7.4 releases of <productname>PostgreSQL</productname>:
>     (etc etc)
>
> Seems like we could have gotten rid of that by now, but when exactly
> does it become fair game?  And can we have a non-ad-hoc process for
> getting rid of such cruft?

I agree we should get rid of it and I'm suggesting our policy be that we
only go back about 10 years.  As for the process part, I suggested that
we make it an every-5-year thing, but we could make it be part of the
annual process instead.

We have a number of general tasks that go into each major release and
some of that process is documented, though it seems like a lot isn't as
explicitly spelled out as perhaps it should be.  Here I'm thinking about
things like:

- Get a CFM for each commitfest
- Form an RMT for each major release
- Figure out who will run each major/minor release
- Get translations done
- Review contributors to see who might become a committer
- other things, I'm sure

"Clean up documentation and remove things older than 10 years" could be
another item to get checked off each year.  We might consider looking at
Debian-

https://wiki.debian.org/Teams/ReleaseTeam

and

https://wiki.debian.org/Teams/ReleaseTeam/ReleaseCheckList

Perhaps the past RMTs have thought about this also.  Having these things
written down and available would be good though, and then we should make
sure that they're assigned out and get addressed (maybe that becomes
part of what the RMT does, maybe not).

Thanks,

Stephen

Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Heikki Linnakangas
Date:
On 23/10/2020 17:51, Tom Lane wrote:
> But anyway, this was about documentation not code.  What I'm wondering
> about is when to drop things like, say, this bit in the regex docs:
> 
>      Two significant incompatibilities exist between AREs and the ERE syntax
>      recognized by pre-7.4 releases of <productname>PostgreSQL</productname>:
>      (etc etc)
> 
> Seems like we could have gotten rid of that by now, but when exactly
> does it become fair game?  And can we have a non-ad-hoc process for
> getting rid of such cruft?

Let's try to zoom in on a rule:

Anything that talks about 9.4 or above (min supported version - 1) 
should definitely be left in place.

Something around 9.0 is possibly still useful to someone upgrading or 
updating an application. Or someone might still bump into old blog posts 
from that era.

Before that, I don't see much value. Although you could argue that I 
jumped the gun on the notice about pre-8.2 pg_dump -t behavior. pg_dump 
still supports servers down to 8.0, so someone might also have an 8.0 
pg_dump binary lying around, and might be confused that -t behaves 
differently. On the whole though, I think removing it was fair game.
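
A rough sketch of the rule above as a first-pass triage, for illustration only. The 9.4 cutoff comes from the text; treating "around 9.0" as everything from 9.0 up to (but not including) 9.4 is an assumption, and the function and label names are hypothetical. Context still matters, so this can only suggest which notes to look at, not decide anything.

```python
from typing import Tuple

KEEP_FROM = (9, 4)   # "min supported version - 1" at the time of writing
MAYBE_FROM = (9, 0)  # assumption: "around 9.0" means 9.0 <= version < 9.4


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a version string such as '8.2' or '10' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def classify_note(version: str) -> str:
    """Suggest what to do with a doc note that mentions the given version."""
    parsed = parse_version(version)
    if parsed >= KEEP_FROM:
        return "keep"
    if parsed >= MAYBE_FROM:
        return "possibly still useful"
    return "candidate for removal"


for mentioned in ("7.4", "8.2", "9.1", "9.4", "10"):
    print(mentioned, "->", classify_note(mentioned))
```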

I did some grepping for strings like "version 7", "pre-8" and so on. I 
couldn't come up with a clear rule on what could be removed. Context 
matters. In text that talks about protocol versions or libpq functions 
like PQlibVersion() it seems sensible to go back as far as possible, for
completeness. And subtle user-visible differences in behavior are
more important to document than changes in internal C APIs that cause a 
compiler failure, for example.
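
The grepping itself could look something like the sketch below. The patterns "version 7" and "pre-8" are the ones mentioned above; the exact regular expression, the function name and the sample text are assumptions made for illustration, and a real pass over the SGML would need more patterns (for example "Before <productname>PostgreSQL</productname> 8.2").

```python
import re
from typing import Iterator, Tuple

# Matches phrases such as "pre-7.4", "pre-8" or "version 8.4".
# Only major versions 7 and 8 are considered (illustrative choice).
OLD_VERSION_RE = re.compile(
    r"\b(?:pre-|version\s+)([78](?:\.\d+)?)\b", re.IGNORECASE
)


def find_old_version_mentions(text: str) -> Iterator[Tuple[int, str]]:
    """Yield (line number, matched phrase) for each old-version mention."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in OLD_VERSION_RE.finditer(line):
            yield lineno, match.group(0)


sample = """\
recognized by pre-7.4 releases of PostgreSQL
The interface was changed in version 8.4, to reflect the new FSM
This function appeared in PostgreSQL version 9.1, so
"""

hits = list(find_old_version_mentions(sample))
for lineno, phrase in hits:
    print(f"line {lineno}: {phrase}")
```

The third sample line is deliberately left unmatched, since 9.1 is recent enough that nobody is proposing to touch it.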

Other notices are about old syntax that's kept for backwards 
compatibility, but still works. It makes sense to mention the old 
version in those cases, even if it's very old, because the alternative 
would be to just say something like "very old version", which is not any 
shorter, just less precise.


Findings in detail follow. And attached is a patch about the stuff that 
I think can be removed pretty straightforwardly.

array.sgml:
   <para>
    If the value written for an element is <literal>NULL</literal> (in any case
    variant), the element is taken to be NULL.  The presence of any quotes
    or backslashes disables this and allows the literal string value
    <quote>NULL</quote> to be entered.  Also, for backward compatibility with
    pre-8.2 versions of <productname>PostgreSQL</productname>, the <xref
    linkend="guc-array-nulls"/> configuration parameter can be turned
    <literal>off</literal> to suppress recognition of <literal>NULL</literal> as a NULL.
   </para>

The GUC still exists, so we should keep this.

catalogs.sgml:
   <para>
    The view <structname>pg_group</structname> exists for backwards
    compatibility: it emulates a catalog that existed in
    <productname>PostgreSQL</productname> before version 8.1.
    It shows the names and members of all roles that are marked as not
    <structfield>rolcanlogin</structfield>, which is an approximation to the set
    of roles that are being used as groups.
   </para>

pg_group still exists, and that paragraph explains why. We should keep 
it. (There's a similar paragraph for pg_shadow)

config.sgml (on synchronized_scans):

        <para>
         This allows sequential scans of large tables to synchronize with each
         other, so that concurrent scans read the same block at about the
         same time and hence share the I/O workload.  When this is enabled,
         a scan might start in the middle of the table and then <quote>wrap
         around</quote> the end to cover all rows, so as to synchronize with the
         activity of scans already in progress.  This can result in
         unpredictable changes in the row ordering returned by queries that
         have no <literal>ORDER BY</literal> clause.  Setting this parameter to
         <literal>off</literal> ensures the pre-8.3 behavior in which a sequential
         scan always starts from the beginning of the table.  The default
         is <literal>on</literal>.
        </para>

We could remove the reference to 8.3 version. I'm inclined to keep it 
though.

func.sgml (String Functions and Operators):
     <note>
     <para>
      Before <productname>PostgreSQL</productname> 8.3, these functions would
      silently accept values of several non-string data types as well, due to
      the presence of implicit coercions from those data types to
      <type>text</type>.  Those coercions have been removed because they frequently
      caused surprising behaviors.  However, the string concatenation operator
      (<literal>||</literal>) still accepts non-string input, so long as at least one
      input is of a string type, as shown in <xref
      linkend="functions-string-sql"/>.  For other cases, insert an explicit
      coercion to <type>text</type> if you need to duplicate the previous behavior.
     </para>
    </note>

Could remove the reference to 8.3, but the information about || still 
makes sense. I'm inclined to just keep it.

func.sgml:
    <note>
      <para>
      Before <productname>PostgreSQL</productname> 8.2, the containment
      operators <literal>@&gt;</literal> and <literal>&lt;@</literal> were respectively
      called <literal>~</literal> and <literal>@</literal>.  These names are still
      available, but are deprecated and will eventually be removed.
     </para>
    </note>

The old names are still available, so should keep this.

func.sgml:
    <para>
     Before <productname>PostgreSQL</productname> 8.1, the arguments of the
     sequence functions were of type <type>text</type>, not <type>regclass</type>, and
     the above-described conversion from a text string to an OID value would
     happen at run time during each call.  For backward compatibility, this
     facility still exists, but internally it is now handled as an implicit
     coercion from <type>text</type> to <type>regclass</type> before the function is
     invoked.
    </para>

Let's remove this.

func.sgml:
   <para>
    <xref linkend="array-operators-table"/> shows the specialized operators
    available for array types.
    In addition to those, the usual comparison operators shown in <xref
    linkend="functions-comparison-op-table"/> are available for
    arrays.  The comparison operators compare the array contents
    element-by-element, using the default B-tree comparison function for
    the element data type, and sort based on the first difference.
    In multidimensional arrays the elements are visited in row-major order
    (last subscript varies most rapidly).
    If the contents of two arrays are equal but the dimensionality is
    different, the first difference in the dimensionality information
    determines the sort order.  (This is a change from versions of
    <productname>PostgreSQL</productname> prior to 8.2: older versions would claim
    that two arrays with the same contents were equal, even if the
    number of dimensions or subscript ranges were different.)
   </para>

Could remove it.

    <note>
      <para>
      There are two differences in the behavior of <function>string_to_array</function>
      from pre-9.1 versions of <productname>PostgreSQL</productname>.
      First, it will return an empty (zero-element) array rather
      than <literal>NULL</literal> when the input string is of zero length.
      Second, if the delimiter string is <literal>NULL</literal>, the function
      splits the input into individual characters, rather than
      returning <literal>NULL</literal> as before.
     </para>
    </note>

Feels too early to remove.
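
To make the two differences in that note concrete, here is a toy model in Python. It is not how the server implements string_to_array; the function name and the pre_9_1 flag are hypothetical, and None stands in for SQL NULL. The handling of an empty-string delimiter is an assumption that is not part of the note.

```python
from typing import List, Optional


def string_to_array_model(
    text: Optional[str],
    delimiter: Optional[str],
    pre_9_1: bool = False,
) -> Optional[List[str]]:
    """Toy model of string_to_array; None stands in for SQL NULL."""
    if text is None:
        return None
    if text == "":
        # First difference: empty array instead of NULL for empty input.
        return None if pre_9_1 else []
    if delimiter is None:
        # Second difference: split into characters instead of returning NULL.
        return None if pre_9_1 else list(text)
    if delimiter == "":
        # Assumption: an empty delimiter leaves the input as a single field
        # (this also avoids Python's ValueError for an empty separator).
        return [text]
    return text.split(delimiter)


print(string_to_array_model("", ","))                  # 9.1 and later
print(string_to_array_model("", ",", pre_9_1=True))    # before 9.1
print(string_to_array_model("abc", None))              # 9.1 and later
print(string_to_array_model("abc", None, pre_9_1=True))
```

Ordinary input such as 'a,b' with ',' behaves the same in both modes, which is why the note only matters for the edge cases.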

   <note>
    <para>
     Prior to <productname>PostgreSQL</productname> 8.2, the
     <literal>&lt;</literal>, <literal>&lt;=</literal>, <literal>&gt;</literal> and <literal>&gt;=</literal>
     cases were not handled per SQL specification.  A comparison like
     <literal>ROW(a,b) &lt; ROW(c,d)</literal>
     was implemented as
     <literal>a &lt; c AND b &lt; d</literal>
     whereas the correct behavior is equivalent to
     <literal>a &lt; c OR (a = c AND b &lt; d)</literal>.
    </para>
   </note>

Important incompatibility. Although very old. I'm inclined to keep it. 
If we remove it, it'd still be useful to explain the new behavior.
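
A small sketch of why that incompatibility matters: the two formulas from the note, written as plain Python predicates, disagree on quite ordinary inputs. The function names are hypothetical and NULL handling is ignored entirely. The SQL-spec formula is the usual lexicographic ordering, which is also how Python compares tuples.

```python
from itertools import product


def row_lt_pre_8_2(a, b, c, d) -> bool:
    """ROW(a,b) < ROW(c,d) as implemented before 8.2, per the note."""
    return a < c and b < d


def row_lt_sql(a, b, c, d) -> bool:
    """ROW(a,b) < ROW(c,d) per the SQL specification, per the note."""
    return a < c or (a == c and b < d)


# Every combination of small integers where the two definitions disagree.
disagreements = [
    (a, b, c, d)
    for a, b, c, d in product(range(3), repeat=4)
    if row_lt_pre_8_2(a, b, c, d) != row_lt_sql(a, b, c, d)
]

print(row_lt_pre_8_2(1, 5, 2, 0), row_lt_sql(1, 5, 2, 0))
print(len(disagreements), "disagreements, e.g.", disagreements[0])
```

For example, ROW(1,5) < ROW(2,0) is false under the old formula but true under the specified one.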

gin.sgml:
<title>GIN Tips and Tricks</title>

  <variablelist>
   <varlistentry>
    <term>Create vs. insert</term>
    <listitem>
     <para>
      Insertion into a <acronym>GIN</acronym> index can be slow
      due to the likelihood of many keys being inserted for each item.
      So, for bulk insertions into a table it is advisable to drop the GIN
      index and recreate it after finishing bulk insertion.
     </para>

     <para>
      As of <productname>PostgreSQL</productname> 8.4, this advice is less
      necessary since delayed indexing is used (see <xref
      linkend="gin-fast-update"/> for details).  But for very large updates
      it may still be best to drop and recreate the index.
     </para>
    </listitem>
   </varlistentry>

I think that's old enough, but the paragraph would need some 
copy-editing, not just removal.

high-availability.sgml (Record-based log shipping)
   <sect2 id="warm-standby-record">
    <title>Record-Based Log Shipping</title>

    <para>
     It is also possible to implement record-based log shipping using this
     alternative method, though this requires custom development, and changes
     will still only become visible to hot standby queries after a full WAL
     file has been shipped.
    </para>

    <para>
     An external program can call the <function>pg_walfile_name_offset()</function>
     function (see <xref linkend="functions-admin"/>)
     to find out the file name and the exact byte offset within it of
     the current end of WAL.  It can then access the WAL file directly
     and copy the data from the last known end of WAL through the current end
     over to the standby servers.  With this approach, the window for data
     loss is the polling cycle time of the copying program, which can be very
     small, and there is no wasted bandwidth from forcing partially-used
     segment files to be archived.  Note that the standby servers'
     <varname>restore_command</varname> scripts can only deal with whole WAL files,
     so the incrementally copied data is not ordinarily made available to
     the standby servers.  It is of use only when the primary dies &mdash;
     then the last partial WAL file is fed to the standby before allowing
     it to come up.  The correct implementation of this process requires
     cooperation of the <varname>restore_command</varname> script with the data
     copying program.
    </para>

    <para>
     Starting with <productname>PostgreSQL</productname> version 9.0, you can use
     streaming replication (see <xref linkend="streaming-replication"/>) to
     achieve the same benefits with less effort.
    </para>
   </sect2>

I think we should remove this whole section. Writing your own 
record-level log shipping by polling pg_walfile_name_offset() is 
malpractice on modern versions, when you could use streaming replication 
instead. The whole "Alternative Method for Log Shipping" section is 
pretty outdated.

indexam.sgml:
   <para>
    As of <productname>PostgreSQL</productname> 8.4,
    <function>amvacuumcleanup</function> will also be called at completion of an
    <command>ANALYZE</command> operation.  In this case <literal>stats</literal> is always
    NULL and any return value will be ignored.  This case can be distinguished
    by checking <literal>info-&gt;analyze_only</literal>.  It is recommended
    that the access method do nothing except post-insert cleanup in such a
    call, and that only in an autovacuum worker process.
   </para>

Let's remove the "As of PostgreSQL 8.4".

    <para>
     The standard installation provides all the header files needed for client
     application development as well as for server-side program
     development, such as custom functions or data types written in C.
     (Prior to <productname>PostgreSQL</productname> 8.0, a separate <literal>make
     install-all-headers</literal> command was needed for the latter, but this
     step has been folded into the standard install.)
    </para>

Remove.

      <listitem>
       <para>
        Interrogates the frontend/backend protocol being used.
<synopsis>
int PQprotocolVersion(const PGconn *conn);
</synopsis>
        Applications might wish to use this function to determine whether certain
        features are supported.  Currently, the possible values are 2 (2.0
        protocol), 3 (3.0 protocol), or zero (connection bad).  The
        protocol version will
        not change after connection startup is complete, but it could
        theoretically change during a connection reset.  The 3.0 protocol
        will normally be used when communicating with
        <productname>PostgreSQL</productname> 7.4 or later servers; pre-7.4 servers
        support only protocol 2.0.  (Protocol 1.0 is obsolete and not
        supported by <application>libpq</application>.)
       </para>
      </listitem>

Talking about old versions, even very old ones, seems appropriate for a 
function like PQprotocolVersion().

libpq.sgml, on PQlibVersion():
      <note>
       <para>
        This function appeared in <productname>PostgreSQL</productname> version 9.1, so
        it cannot be used to detect required functionality in earlier
        versions, since calling it will create a link dependency
        on version 9.1 or later.
       </para>
      </note>

Seems appropriate to keep.

libpq.sgml:
       <para>
        <xref linkend="libpq-PQinitSSL"/> has been present since
        <productname>PostgreSQL</productname> 8.0, while <xref linkend="libpq-PQinitOpenSSL"/>
        was added in <productname>PostgreSQL</productname> 8.4, so <xref linkend="libpq-PQinitSSL"/>
        might be preferable for applications that need to work with older
        versions of <application>libpq</application>.
       </para>

Keep.

lobj.sgml:
     <para>
      <indexterm><primary>lo_creat</primary></indexterm>
      The function
<synopsis>
Oid lo_creat(PGconn *conn, int mode);
</synopsis>
      creates a new large object.
      The return value is the OID that was assigned to the new large object,
      or <symbol>InvalidOid</symbol> (zero) on failure.

      <replaceable class="parameter">mode</replaceable> is unused and
      ignored as of <productname>PostgreSQL</productname> 8.1; however, for
      backward compatibility with earlier releases it is best to
      set it to <symbol>INV_READ</symbol>, <symbol>INV_WRITE</symbol>,
      or <symbol>INV_READ</symbol> <literal>|</literal> <symbol>INV_WRITE</symbol>.
      (These symbolic constants are defined
      in the header file <filename>libpq/libpq-fs.h</filename>.)
     </para>

We need to say something about 'mode'. Keep.

pgfreespacemap.sgml:
   <note>
    <para>
     The interface was changed in version 8.4, to reflect the new FSM
     implementation introduced in the same version.
    </para>
   </note>

Remove.

pgstandby.sgml:
   <para>
    <application>pg_standby</application> is designed to work with
    <productname>PostgreSQL</productname> 8.2 and later.
   </para>

IMHO we should remove pg_standby altogether. Until we get around to 
that, I think we should keep that note because it gives you a hint that 
it's old :-).

pgarchivecleanup.sgml:
   <para>
    <application>pg_archivecleanup</application> is designed to work with
    <productname>PostgreSQL</productname> 8.0 and later when used as a standalone utility,
    or with <productname>PostgreSQL</productname> 9.0 and later when used as an
    archive cleanup command.
   </para>

Ditto.

planstats.sgml:
   <para>
    The examples shown below use tables in the <productname>PostgreSQL</productname>
    regression test database.
    The outputs shown are taken from version 8.3.
    The behavior of earlier (or later) versions might vary.

Should refresh the outputs.

plpgsql.sgml:
        <para>
         When used with a
         <literal>BEGIN</literal> block, <literal>EXIT</literal> passes
         control to the next statement after the end of the block.
         Note that a label must be used for this purpose; an unlabeled
         <literal>EXIT</literal> is never considered to match a
         <literal>BEGIN</literal> block.  (This is a change from
         pre-8.4 releases of <productname>PostgreSQL</productname>, which
         would allow an unlabeled <literal>EXIT</literal> to match
         a <literal>BEGIN</literal> block.)
        </para>

Maybe keep for a couple more years.

protocol.sgml:
  <para>
   This document describes version 3.0 of the protocol, implemented in
   <productname>PostgreSQL</productname> 7.4 and later.  For descriptions
   of the earlier protocol versions, see previous releases of the
   <productname>PostgreSQL</productname> documentation.  A single server
   can support multiple protocol versions.  The initial startup-request
   message tells the server which protocol version the client is attempting to
   use.  If the major version requested by the client is not supported by
   the server, the connection will be rejected (for example, this would occur
   if the client requested protocol version 4.0, which does not exist as of
   this writing).  If the minor version requested by the client is not
   supported by the server (e.g., the client requests version 3.1, but the
   server supports only 3.0), the server may either reject the connection or
   may respond with a NegotiateProtocolVersion message containing the highest
   minor protocol version which it supports.  The client may then choose either
   to continue with the connection using the specified protocol version or
   to abort the connection.
  </para>

Keep.
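
To make the negotiation rule in that paragraph concrete, here is a minimal
sketch in Python. It assumes the startup packet carries the version as a
32-bit integer with the major number in the high 16 bits and the minor
number in the low 16 bits (so 3.0 is 196608). The function name negotiate
and the returned strings are made up for illustration; this is not server
code, and it always picks the NegotiateProtocolVersion branch even though
the text allows a server to reject instead.

```python
SUPPORTED_MAJOR = 3
SUPPORTED_MINOR = 0  # highest minor version this hypothetical server speaks


def negotiate(requested: int) -> str:
    """Decide how a hypothetical server answers a startup request."""
    major = requested >> 16      # high 16 bits: major version
    minor = requested & 0xFFFF   # low 16 bits: minor version
    if major != SUPPORTED_MAJOR:
        # Unsupported major version: the connection is rejected.
        return "reject"
    if minor > SUPPORTED_MINOR:
        # Unsupported minor version: offer the highest minor we support.
        # (A real server may also simply reject here.)
        return f"NegotiateProtocolVersion {SUPPORTED_MAJOR}.{SUPPORTED_MINOR}"
    return "accept"
```

With these assumptions, a client asking for 3.1 is offered 3.0, and a client
asking for 4.0 is turned away.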

      <varlistentry>
       <term>AuthenticationSCMCredential</term>
       <listitem>
        <para>
         This response is only possible for local Unix-domain connections
         on platforms that support SCM credential messages.  The frontend
         must issue an SCM credential message and then send a single data
         byte.  (The contents of the data byte are uninteresting; it's
         only used to ensure that the server waits long enough to receive
         the credential message.)  If the credential is acceptable,
         the server responds with an
         AuthenticationOk, otherwise it responds with an ErrorResponse.
         (This message type is only issued by pre-9.1 servers.  It may
         eventually be removed from the protocol specification.)
        </para>
       </listitem>
      </varlistentry>

Keep. It's surely still referred to in client libraries.

    <para>
     Data of a particular data type might be transmitted in any of several
     different <firstterm>formats</firstterm>.  As of <productname>PostgreSQL</productname> 7.4
     the only supported formats are <quote>text</quote> and <quote>binary</quote>,
     but the protocol makes provision for future extensions.  The desired
     format for any value is specified by a <firstterm>format code</firstterm>.
     Clients can specify a format code for each transmitted parameter value
     and for each column of a query result.  Text has format code zero,
     binary has format code one, and all other format codes are reserved
     for future definition.
    </para>

Could replace the "as of PostgreSQL 7.4" with "Currently", but it's not 
much shorter.

        <para>
         For a <command>COPY</command> command, the tag is
         <literal>COPY <replaceable>rows</replaceable></literal> where
         <replaceable>rows</replaceable> is the number of rows copied.
         (Note: the row count appears only in
         <productname>PostgreSQL</productname> 8.2 and later.)
        </para>

I think we should keep, since we mentioned earlier that the protocol 
documentation is for 7.4 and later.
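
As an illustration of why the note still matters to client authors, here is
a rough Python sketch of a tag parser that tolerates both forms. The
function name copy_row_count is hypothetical, and the assumption that an
old server sends a bare "COPY" tag is inferred from the note above rather
than checked against an 8.1 server.

```python
from typing import Optional


def copy_row_count(tag: str) -> Optional[int]:
    """Return the row count from a COPY command tag, or None if absent."""
    parts = tag.split()
    if not parts or parts[0] != "COPY":
        raise ValueError(f"not a COPY tag: {tag!r}")
    if len(parts) == 1:
        # Assumed pre-8.2 form: the tag carries no row count.
        return None
    return int(parts[1])
```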

alter_opfamily.sgml and create_opclass.sgml:
   <para>
    Before <productname>PostgreSQL</productname> 8.4, the <literal>OPERATOR</literal>
    clause could include a <literal>RECHECK</literal> option.  This is no longer
    supported because whether an index operator is <quote>lossy</quote> is now
    determined on-the-fly at run time.  This allows efficient handling of
    cases where an operator might or might not be lossy.
   </para>

Keep, since the syntax is still supported (but ignored).

cluster.sgml:
   <para>
    The syntax
<synopsis>
CLUSTER <replaceable class="parameter">index_name</replaceable> ON <replaceable class="parameter">table_name</replaceable>
</synopsis>
   is also supported for compatibility with pre-8.3 <productname>PostgreSQL</productname>
   versions.
   </para>

Keep, since the syntax is still supported.

copy.sgml:
   <para>
    The following syntax was used before <productname>PostgreSQL</productname>
    version 9.0 and is still supported:
...
   <para>
    The following syntax was used before <productname>PostgreSQL</productname>
    version 7.3 and is still supported:

Keep, since the syntax is still supported.

create_function.sgml:
    <para>
     Before <productname>PostgreSQL</productname> version 8.3, the
     <literal>SET</literal> clause was not available, and so older functions may
     contain rather complicated logic to save, set, and restore
     <varname>search_path</varname>.  The <literal>SET</literal> clause is far easier
     to use for this purpose.
    </para>

Keep; those old functions with complicated logic might still exist in the wild.

create_type.sgml:
   <para>
    Before <productname>PostgreSQL</productname> version 8.3, the name of
    a generated array type was always exactly the element type's name with one
    underscore character (<literal>_</literal>) prepended.  (Type names were
    therefore restricted in length to one less character than other names.)
    While this is still usually the case, the array type name may vary from
    this in case of maximum-length names or collisions with user type names
    that begin with underscore.  Writing code that depends on this convention
    is therefore deprecated.  Instead, use
    <structname>pg_type</structname>.<structfield>typarray</structfield> to locate the array type
    associated with a given type.
   </para>

Let's keep it. We could remove the reference to 8.3, but would still 
need to explain the behaviour, and I think it's easiest to explain 
through its history.

create_type.sgml:
   <para>
    Before <productname>PostgreSQL</productname> version 8.2, the shell-type
    creation syntax
    <literal>CREATE TYPE <replaceable>name</replaceable></literal> did not exist.
    The way to create a new base type was to create its input function first.
    In this approach, <productname>PostgreSQL</productname> will first see
    the name of the new data type as the return type of the input function.
    The shell type is implicitly created in this situation, and then it
    can be referenced in the definitions of the remaining I/O functions.
    This approach still works, but is deprecated and might be disallowed in
    some future release.  Also, to avoid accidentally cluttering
    the catalogs with shell types as a result of simple typos in function
    definitions, a shell type will only be made this way when the input
    function is written in C.
   </para>

The deprecated way still works, so keep.

grant.sgml:
    <para>
     Since <productname>PostgreSQL</productname> 8.1, the concepts of users and
     groups have been unified into a single kind of entity called a role.
     It is therefore no longer necessary to use the keyword <literal>GROUP</literal>
     to identify whether a grantee is a user or a group. <literal>GROUP</literal>
     is still allowed in the command, but it is a noise word.
    </para>

The GROUP keyword is still accepted, so let's keep it.

pg_config-ref.sgml:
   <para>
    The options <option>--docdir</option>, <option>--pkgincludedir</option>,
    <option>--localedir</option>, <option>--mandir</option>,
    <option>--sharedir</option>, <option>--sysconfdir</option>,
    <option>--cc</option>, <option>--cppflags</option>,
    <option>--cflags</option>, <option>--cflags_sl</option>,
    <option>--ldflags</option>, <option>--ldflags_sl</option>,
    and <option>--libs</option> were added in <productname>PostgreSQL</productname> 8.1.
    The option <option>--htmldir</option> was added in <productname>PostgreSQL</productname> 8.4.
    The option <option>--ldflags_ex</option> was added in <productname>PostgreSQL</productname> 9.0.
   </para>

Let's keep these. This could still be relevant if someone is maintaining 
an extension that's backwards compatible to old versions.

pg_dumpall.sgml:
      <varlistentry>
       <term><option>--lock-wait-timeout=<replaceable class="parameter">timeout</replaceable></option></term>
       <listitem>
        <para>
         Do not wait forever to acquire shared table locks at the beginning of
         the dump. Instead, fail if unable to lock a table within the specified
         <replaceable class="parameter">timeout</replaceable>. The timeout may be
         specified in any of the formats accepted by <command>SET
         statement_timeout</command>.  Allowed values vary depending on the server
         version you are dumping from, but an integer number of milliseconds
         is accepted by all versions since 7.3.  This option is ignored when
         dumping from a pre-7.3 server.
        </para>
       </listitem>
      </varlistentry>

pg_dump no longer supports pre-8.0 versions, so this is definitely 
obsolete. Remove.

psql-ref.sgml:
       <listitem>
       <para>
        Before <productname>PostgreSQL</productname> 8.4,
        <application>psql</application> allowed the
        first argument of a single-letter backslash command to start
        directly after the command, without intervening whitespace.
        Now, some whitespace is required.
       </para>
       </listitem>

Keep for a few more years.

psql-ref.sgml:
           <para><literal>old-ascii</literal> style uses plain <acronym>ASCII</acronym>
           characters, using the formatting style used
           in <productname>PostgreSQL</productname> 8.4 and earlier.
           Newlines in data are shown using a <literal>:</literal>
           symbol in place of the left-hand column separator.
           When the data is wrapped from one line
           to the next without a newline character, a <literal>;</literal>
           symbol is used in place of the left-hand column separator.
           </para>

Keep, as long as we keep the format.

    <note>
      <para>
      Before <productname>PostgreSQL</productname> 8.2, the
      <literal>.*</literal> syntax was not expanded in row constructors, so
      that writing <literal>ROW(t.*, 42)</literal> created a two-field row whose first
      field was another row value.  The new behavior is usually more useful.
      If you need the old behavior of nested row values, write the inner
      row value without <literal>.*</literal>, for instance
      <literal>ROW(t, 42)</literal>.
     </para>
    </note>

I'm inclined to keep this; someone might still need that behaviour, not
necessarily for backwards compatibility but because you might want to do
that in an application. Or rewrite without the reference to 8.2.

   <para>
    For comparison, the <productname>PostgreSQL</productname> 8.1 documentation
    contained 10,441 unique words, a total of 335,420 words, and the most
    frequent word <quote>postgresql</quote> was mentioned 6,127 times in 655
    documents.
   </para>

    <!-- TODO we need to put a date on these numbers? -->
   <para>
    Another example — the <productname>PostgreSQL</productname> 
mailing
    list archives contained 910,989 unique words with 57,491,343 lexemes in
    461,020 messages.
   </para>

Refresh the numbers.

   <note>
    <para>
     In the SQL standard, there is a clear distinction between users and roles,
     and users do not automatically inherit privileges while roles do.  This
     behavior can be obtained in <productname>PostgreSQL</productname> by giving
     roles being used as SQL roles the <literal>INHERIT</literal> attribute, while
     giving roles being used as SQL users the <literal>NOINHERIT</literal> attribute.
     However, <productname>PostgreSQL</productname> defaults to giving all roles
     the <literal>INHERIT</literal> attribute, for backward compatibility with pre-8.1
     releases in which users always had use of permissions granted to groups
     they were members of.
    </para>
   </note>

Keep, since that's still how it behaves.

xindex.sgml:
   <note>
     <para>
     Prior to <productname>PostgreSQL</productname> 8.3, there was no concept
     of operator families, and so any cross-data-type operators intended to be
     used with an index had to be bound directly into the index's operator
     class.  While this approach still works, it is deprecated because it
     makes an index's dependencies too broad, and because the planner can
     handle cross-data-type comparisons more effectively when both data types
     have operators in the same operator family.
    </para>
   </note>

Keep, because the old method still works.

- Heikki

Attachment

Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Tom Lane
Date:
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> On 23/10/2020 17:51, Tom Lane wrote:
>> Seems like we could have gotten rid of that by now, but when exactly
>> does it become fair game?  And can we have a non-ad-hoc process for
>> getting rid of such cruft?

> I did some grepping for strings like "version 7", "pre-8" and so on. I
> couldn't come up with a clear rule on what could be removed. Context
> matters.

Yeah, that's unsurprising.  But thanks for all the effort you put into
this review!

> Findings in detail follow. And attached is a patch about the stuff that
> I think can be removed pretty straightforwardly.

I agree with the patch, and with your other thoughts, except as noted
below.

> config.sgml (on synchronized_scans):

>          have no <literal>ORDER BY</literal> clause.  Setting this parameter to
>          <literal>off</literal> ensures the pre-8.3 behavior in which a sequential
>          scan always starts from the beginning of the table.  The default
>          is <literal>on</literal>.

> We could remove the reference to 8.3 version. I'm inclined to keep it
> though.

Maybe s/pre-8.3/simple/, or some similar adjective?

> func.sgml:
>     <note>
>       <para>
>       Before <productname>PostgreSQL</productname> 8.2, the containment
>       operators <literal>@></literal> and <literal><@</literal>
> were respectively
>       called <literal>~</literal> and <literal>@</literal>.  These names
> are still
>       available, but are deprecated and will eventually be removed.
>      </para>
>     </note>

> The old names are still available, so should keep this.

Perhaps it's time to actually remove those operators as threatened here?
That's material for a separate discussion, though.

>     If the contents of two arrays are equal but the dimensionality is
>     different, the first difference in the dimensionality information
>     determines the sort order.  (This is a change from versions of
>     <productname>PostgreSQL</productname> prior to 8.2: older versions
> would claim
>     that two arrays with the same contents were equal, even if the
>     number of dimensions or subscript ranges were different.)
>    </para>

> Could remove it.

Yeah, I'm OK with removing the parenthetical comment.

>       There are two differences in the behavior of <function>string_to_array</function>
>       from pre-9.1 versions of <productname>PostgreSQL</productname>.

> Feels too early to remove.

+1.  9.1 was in support till ~4 years ago; 8.2 EOL'd 9 years ago.
I'm not sure where to put the cutoff, but 4 years seems too little.

>    <note>
>     <para>
>      Prior to <productname>PostgreSQL</productname> 8.2, the
>      <literal><</literal>, <literal><=</literal>,
> <literal>></literal> and <literal>>=</literal>
>      cases were not handled per SQL specification.  A comparison like
>      <literal>ROW(a,b) < ROW(c,d)</literal>
>      was implemented as
>      <literal>a < c AND b < d</literal>
>      whereas the correct behavior is equivalent to
>      <literal>a < c OR (a = c AND b < d)</literal>.
>     </para>
>    </note>

> Important incompatibility. Although very old. I'm inclined to keep it.
> If we remove it, it'd still be useful to explain the new behavior.

Yeah, even if we don't care about 8.2, some of this text is useful
to clarify the behavior of row comparisons.  I haven't looked at
the surrounding material, but I'd not want to just delete this
unless it's clearly duplicative.
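
For anyone following along, the two semantics in that note can be compared
with a small Python sketch. The function names are invented for
illustration, and NULL handling is deliberately left out, so this is only
a model of the two boolean formulas quoted above, not of the server's
actual row comparison code.

```python
def row_lt_pre_82(a, b, c, d):
    """ROW(a,b) < ROW(c,d) as the note says pre-8.2 implemented it."""
    return a < c and b < d


def row_lt_sql(a, b, c, d):
    """ROW(a,b) < ROW(c,d) per the SQL specification (lexicographic)."""
    return a < c or (a == c and b < d)


# ROW(1,5) < ROW(2,3): the first column already decides the answer.
old_answer = row_lt_pre_82(1, 5, 2, 3)   # False, because 5 < 3 fails
new_answer = row_lt_sql(1, 5, 2, 3)      # True, because 1 < 2
```

The spec-compliant form agrees with Python's own tuple ordering, which is
also lexicographic.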

>       As of <productname>PostgreSQL</productname> 8.4, this advice is less
>       necessary since delayed indexing is used (see <xref
>       linkend="gin-fast-update"/> for details).  But for very large updates
>       it may still be best to drop and recreate the index.

> I think that's old enough, but the paragraph would need some
> copy-editing, not just removal.

Right, same deal, needs a bit of wordsmithing not just deletion.

>       <replaceable class="parameter">mode</replaceable> is unused and
>       ignored as of <productname>PostgreSQL</productname> 8.1; however, for
>       backward compatibility with earlier releases it is best to
>       set it to <symbol>INV_READ</symbol>, <symbol>INV_WRITE</symbol>,
>       or <symbol>INV_READ</symbol> <literal>|</literal>
> <symbol>INV_WRITE</symbol>.

> We need to say something about 'mode'. Keep.

Maybe s/as of/since/, but otherwise fine.

>      Data of a particular data type might be transmitted in any of several
>      different <firstterm>formats</firstterm>.  As of
> <productname>PostgreSQL</productname> 7.4
>      the only supported formats are <quote>text</quote> and
> <quote>binary</quote>,
>      but the protocol makes provision for future extensions.  The desired

> Could replace the "as of PostgreSQL 7.4" with "Currently", but it's not
> much shorter.

While it's not shorter, I think it's clearer in this context.  7.4
is far enough back that a reader might expect the next sentence to
offer updated info.

>     <!-- TODO we need to put a date on these numbers? -->
>    <para>
>     Another example — the <productname>PostgreSQL</productname>
> mailing
>     list archives contained 910,989 unique words with 57,491,343 lexemes in
>     461,020 messages.
>    </para>

> Refresh the numbers.

I agree with the comment: if we keep this, there should be an "as of" date
associated with the numbers.

Thanks again for slogging through that!

            regards, tom lane



Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Stephen Frost
Date:
Greetings,

* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> On 23/10/2020 17:51, Tom Lane wrote:
> >But anyway, this was about documentation not code.  What I'm wondering
> >about is when to drop things like, say, this bit in the regex docs:
> >
> >     Two significant incompatibilities exist between AREs and the ERE syntax
> >     recognized by pre-7.4 releases of <productname>PostgreSQL</productname>:
> >     (etc etc)
> >
> >Seems like we could have gotten rid of that by now, but when exactly
> >does it become fair game?  And can we have a non-ad-hoc process for
> >getting rid of such cruft?
>
> Let's try to zoom in on a rule:
>
> Anything that talks about 9.4 or above (min supported version - 1) should
> definitely be left in place.

Sure.

> Something around 9.0 is possibly still useful to someone upgrading or
> updating an application. Or someone might still bump into old blog posts
> from that era.

Right- going back ~10 years.  (I do think it'd be good to have an actual
policy rather than just "well, right now, it seems like 9.0 is about
right").

> Before that, I don't see much value. Although you could argue that I jumped
> the gun on the notice about pre-8.2 pg_dump -t behavior. pg_dump still
> supports servers down to 8.0, so someone might also have an 8.0 pg_dump
> binary lying around, and might be confused that -t behaves differently. On
> the whole though, I think removing it was fair game.

I don't really have an issue with it, to be clear.  I had been hoping we
might be able to come up with a general rule to apply across both
documentation and code (in particular, pg_dump), but that doesn't seem
to be the case.  That does mean that *some* documentation might end up
needing to keep notes from before 9.0, where that documentation is about
pg_dump and older versions.

> I did some grepping for strings like "version 7", "pre-8" and so on. I
> couldn't come up with a clear rule on what could be removed. Context
> matters. In text that talks about protocol versions or libpq functions like
> PQlibVersion() it seems sensible to go back as far as possible, for the
> completeness. And subtle user-visible differences in behavior are more
> important to document than changes in internal C APIs that cause a compiler
> failure, for example.

I agree that context matters.

> Other notices are about old syntax that's kept for backwards compatibility,
> but still works. It makes sense to mention the old version in those cases,
> even if it's very old, because the alternative would be to just say
> something like "very old version", which is not any shorter, just less
> precise.

I would argue that we shouldn't be keeping things around for backwards
compatibility, in general.  If we feel that it's a good feature to keep
then let's keep it and just document it as an alternative syntax.

> Findings in detail follow. And attached is a patch about the stuff that I
> think can be removed pretty straightforwardly.

Thanks a lot for spending the time going through all of this!

> array.sgml:
>   <para>
>    If the value written for an element is <literal>NULL</literal> (in any
> case
>    variant), the element is taken to be NULL.  The presence of any quotes
>    or backslashes disables this and allows the literal string value
>    <quote>NULL</quote> to be entered.  Also, for backward compatibility with
>    pre-8.2 versions of <productname>PostgreSQL</productname>, the <xref
>    linkend="guc-array-nulls"/> configuration parameter can be turned
>    <literal>off</literal> to suppress recognition of <literal>NULL</literal>
> as a NULL.
>   </para>
>
> The GUC still exists, so we should keep this.

I agree we should keep the documentation as long as the GUC exists, but
we should be considering getting rid of the GUC.

> catalogs.sgml:
>   <para>
>    The view <structname>pg_group</structname> exists for backwards
>    compatibility: it emulates a catalog that existed in
>    <productname>PostgreSQL</productname> before version 8.1.
>    It shows the names and members of all roles that are marked as not
>    <structfield>rolcanlogin</structfield>, which is an approximation to the
> set
>    of roles that are being used as groups.
>   </para>
>
> pg_group still exists, and that paragraph explains why. We should keep it.
> (There's a similar paragraph for pg_shadow)

When I wrote that, many many years ago (apparently about 15, looking
back at the commit responsible), I certainly didn't expect we'd still
have them today.  These views, which have been only haphazardly
maintained and which don't really represent the current system terribly
well, need to go.  In hindsight, introducing them was a mistake in the
first place.  We support 5 major versions for a reason and people should
be updating their code as we make changes- as they have to do in lots of
other parts of the system and for other catalogs (consider the v10
changes of XLOG -> WAL).

If we keep the views we should keep the documentation, of course, but
it's long, long past time to rip out pg_user, pg_group, and pg_shadow.
Now that we have column-level privileges, I'd think we could probably
get rid of pg_roles too as not really providing much value.

> config.sgml (on synchronized_scans):
>
>        <para>
>         This allows sequential scans of large tables to synchronize with
> each
>         other, so that concurrent scans read the same block at about the
>         same time and hence share the I/O workload.  When this is enabled,
>         a scan might start in the middle of the table and then <quote>wrap
>         around</quote> the end to cover all rows, so as to synchronize with
> the
>         activity of scans already in progress.  This can result in
>         unpredictable changes in the row ordering returned by queries that
>         have no <literal>ORDER BY</literal> clause.  Setting this parameter
> to
>         <literal>off</literal> ensures the pre-8.3 behavior in which a
> sequential
>         scan always starts from the beginning of the table.  The default
>         is <literal>on</literal>.
>        </para>
>
> We could remove the reference to 8.3 version. I'm inclined to keep it
> though.

I'm fine keeping the reference, if we keep the GUC.  I'm not really
inclined to keep this GUC though, at least not for the purpose of having
it to match pre-8.3 behavior explicitly.  If there's some usefulness to
this GUC then we should document what that is.  If there isn't, then
let's remove it.

> func.sgml (String Functions and Operators):
>     <note>
>     <para>
>      Before <productname>PostgreSQL</productname> 8.3, these functions would
>      silently accept values of several non-string data types as well, due to
>      the presence of implicit coercions from those data types to
>      <type>text</type>.  Those coercions have been removed because they
> frequently
>      caused surprising behaviors.  However, the string concatenation
> operator
>      (<literal>||</literal>) still accepts non-string input, so long as at
> least one
>      input is of a string type, as shown in <xref
>      linkend="functions-string-sql"/>.  For other cases, insert an explicit
>      coercion to <type>text</type> if you need to duplicate the previous
> behavior.
>     </para>
>    </note>
>
> Could remove the reference to 8.3, but the information about || still makes
> sense. I'm inclined to just keep it.

I'd rather we rip out what the pre-8.3 behavior was as no longer
relevant and shorten this up quite a bit:

The string concatenation operator (<literal>||</literal>) will accept
non-string input, so long as at least one input is of string type, as
shown in <xref linkend="functions-string-sql"/>.  For other cases,
inserting an explicit coercion to <type>text</type> can be used to have
non-string input accepted.

We're talking about v14 for this, after all.

> func.sgml:
>    <note>
>      <para>
>      Before <productname>PostgreSQL</productname> 8.2, the containment
>      operators <literal>@></literal> and <literal><@</literal> were
> respectively
>      called <literal>~</literal> and <literal>@</literal>.  These names are
> still
>      available, but are deprecated and will eventually be removed.
>     </para>
>    </note>
>
> The old names are still available, so should keep this.

We should either remove them, or document them directly as alternative
spellings and admit to ourselves that they aren't deprecated and aren't
going to be removed.

> func.sgml:
>    <para>
>     Before <productname>PostgreSQL</productname> 8.1, the arguments of the
>     sequence functions were of type <type>text</type>, not
> <type>regclass</type>, and
>     the above-described conversion from a text string to an OID value would
>     happen at run time during each call.  For backward compatibility, this
>     facility still exists, but internally it is now handled as an implicit
>     coercion from <type>text</type> to <type>regclass</type> before the
> function is
>     invoked.
>    </para>
>
> Let's remove this.

+1

> func.sgml:
>   <para>
>    <xref linkend="array-operators-table"/> shows the specialized operators
>    available for array types.
>    In addition to those, the usual comparison operators shown in <xref
>    linkend="functions-comparison-op-table"/> are available for
>    arrays.  The comparison operators compare the array contents
>    element-by-element, using the default B-tree comparison function for
>    the element data type, and sort based on the first difference.
>    In multidimensional arrays the elements are visited in row-major order
>    (last subscript varies most rapidly).
>    If the contents of two arrays are equal but the dimensionality is
>    different, the first difference in the dimensionality information
>    determines the sort order.  (This is a change from versions of
>    <productname>PostgreSQL</productname> prior to 8.2: older versions would
> claim
>    that two arrays with the same contents were equal, even if the
>    number of dimensions or subscript ranges were different.)
>   </para>
>
> Could remove it.

+1 to remove.

>    <note>
>      <para>
>      There are two differences in the behavior of
> <function>string_to_array</function>
>      from pre-9.1 versions of <productname>PostgreSQL</productname>.
>      First, it will return an empty (zero-element) array rather
>      than <literal>NULL</literal> when the input string is of zero length.
>      Second, if the delimiter string is <literal>NULL</literal>, the
> function
>      splits the input into individual characters, rather than
>      returning <literal>NULL</literal> as before.
>     </para>
>    </note>
>
> Feels too early to remove.

+0 to remove, for my part.
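
For reference, the two behaviors described in that note can be modeled
with a short Python sketch. This is an emulation written for illustration
only, with None standing in for SQL NULL; the empty-delimiter branch
reflects my understanding of current behavior and is not stated in the
note, and the optional null-string argument is ignored.

```python
from typing import List, Optional


def string_to_array(s: Optional[str], delim: Optional[str]) -> Optional[List[str]]:
    """Rough model of string_to_array as described for 9.1 and later."""
    if s is None:
        return None
    if s == "":
        return []        # pre-9.1 returned NULL here
    if delim is None:
        return list(s)   # pre-9.1 returned NULL here
    if delim == "":
        return [s]       # assumption: whole input as a single element
    return s.split(delim)
```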

>   <note>
>    <para>
>     Prior to <productname>PostgreSQL</productname> 8.2, the
>     <literal><</literal>, <literal><=</literal>,
> <literal>></literal> and <literal>>=</literal>
>     cases were not handled per SQL specification.  A comparison like
>     <literal>ROW(a,b) < ROW(c,d)</literal>
>     was implemented as
>     <literal>a < c AND b < d</literal>
>     whereas the correct behavior is equivalent to
>     <literal>a < c OR (a = c AND b < d)</literal>.
>    </para>
>   </note>
>
> Important incompatibility. Although very old. I'm inclined to keep it. If we
> remove it, it'd still be useful to explain the new behavior.

+1 to remove and replace with an explanation of the new behavior.

> gin.sgml:
> <title>GIN Tips and Tricks</title>
>
>  <variablelist>
>   <varlistentry>
>    <term>Create vs. insert</term>
>    <listitem>
>     <para>
>      Insertion into a <acronym>GIN</acronym> index can be slow
>      due to the likelihood of many keys being inserted for each item.
>      So, for bulk insertions into a table it is advisable to drop the GIN
>      index and recreate it after finishing bulk insertion.
>     </para>
>
>     <para>
>      As of <productname>PostgreSQL</productname> 8.4, this advice is less
>      necessary since delayed indexing is used (see <xref
>      linkend="gin-fast-update"/> for details).  But for very large updates
>      it may still be best to drop and recreate the index.
>     </para>
>    </listitem>
>   </varlistentry>
>
> I think that's old enough, but the paragraph would need some copy-editing,
> not just removal.

How about:

Building a <acronym>GIN</acronym> index after all of the data has been
loaded will typically be faster than creating the index and then filling
it.  There may also be cases where, for a sufficiently large update,
dropping the <acronym>GIN</acronym> index, then performing the update,
and then recreating the index will be faster than a routine update.
However, one should review the delayed indexing technique used for
<acronym>GIN</acronym> (see <xref linkend="gin-fast-update"/> for
details) and the options it provides.

> high-availability.sgml (Record-based log shipping)
>   <sect2 id="warm-standby-record">
>    <title>Record-Based Log Shipping</title>
>
>    <para>
>     It is also possible to implement record-based log shipping using this
>     alternative method, though this requires custom development, and changes
>     will still only become visible to hot standby queries after a full WAL
>     file has been shipped.
>    </para>
>
>    <para>
>     An external program can call the
> <function>pg_walfile_name_offset()</function>
>     function (see <xref linkend="functions-admin"/>)
>     to find out the file name and the exact byte offset within it of
>     the current end of WAL.  It can then access the WAL file directly
>     and copy the data from the last known end of WAL through the current end
>     over to the standby servers.  With this approach, the window for data
>     loss is the polling cycle time of the copying program, which can be very
>     small, and there is no wasted bandwidth from forcing partially-used
>     segment files to be archived.  Note that the standby servers'
>     <varname>restore_command</varname> scripts can only deal with whole WAL
> files,
>     so the incrementally copied data is not ordinarily made available to
>     the standby servers.  It is of use only when the primary dies —
>     then the last partial WAL file is fed to the standby before allowing
>     it to come up.  The correct implementation of this process requires
>     cooperation of the <varname>restore_command</varname> script with the
> data
>     copying program.
>    </para>
>
>    <para>
>     Starting with <productname>PostgreSQL</productname> version 9.0, you can
> use
>     streaming replication (see <xref linkend="streaming-replication"/>) to
>     achieve the same benefits with less effort.
>    </para>
>   </sect2>
>
> I think we should remove this whole section. Writing your own record-level
> log shipping by polling pg_walfile_name_offset() is malpractice on modern
> versions, when you could use streaming replication instead. The whole
> "Alternative Method for Log Shipping" section is pretty outdated.

+1

> indexam.sgml:
>   <para>
>    As of <productname>PostgreSQL</productname> 8.4,
>    <function>amvacuumcleanup</function> will also be called at completion of
> an
>    <command>ANALYZE</command> operation.  In this case
> <literal>stats</literal> is always
>    NULL and any return value will be ignored.  This case can be
> distinguished
>    by checking <literal>info->analyze_only</literal>.  It is recommended
>    that the access method do nothing except post-insert cleanup in such a
>    call, and that only in an autovacuum worker process.
>   </para>
>
> Let's remove the "As of PostgreSQL 8.4".

+1

>    <para>
>     The standard installation provides all the header files needed for
> client
>     application development as well as for server-side program
>     development, such as custom functions or data types written in C.
>     (Prior to <productname>PostgreSQL</productname> 8.0, a separate
> <literal>make
>     install-all-headers</literal> command was needed for the latter, but
> this
>     step has been folded into the standard install.)
>    </para>
>
> Remove.

+1

>      <listitem>
>       <para>
>        Interrogates the frontend/backend protocol being used.
> <synopsis>
> int PQprotocolVersion(const PGconn *conn);
> </synopsis>
>        Applications might wish to use this function to determine whether
> certain
>        features are supported.  Currently, the possible values are 2 (2.0
>        protocol), 3 (3.0 protocol), or zero (connection bad).  The
>        protocol version will
>        not change after connection startup is complete, but it could
>        theoretically change during a connection reset.  The 3.0 protocol
>        will normally be used when communicating with
>        <productname>PostgreSQL</productname> 7.4 or later servers; pre-7.4
> servers
>        support only protocol 2.0.  (Protocol 1.0 is obsolete and not
>        supported by <application>libpq</application>.)
>       </para>
>      </listitem>
>
> Talking about old versions, even very old ones, seems appropriate for a
> function like PQprotocolVersion().

Agreed, at least until we rip out the 2.0 protocol.

> libpq.sgml, on PQlibVersion():
>      <note>
>       <para>
>        This function appeared in <productname>PostgreSQL</productname>
> version 9.1, so
>        it cannot be used to detect required functionality in earlier
>        versions, since calling it will create a link dependency
>        on version 9.1 or later.
>       </para>
>      </note>
>
> Seems appropriate to keep.

+0 to remove, for my part.  People building against libraries should
realize that they're creating a link dependency on things they're
calling.

> libpq.sgml:
>       <para>
>        <xref linkend="libpq-PQinitSSL"/> has been present since
>        <productname>PostgreSQL</productname> 8.0, while <xref
> linkend="libpq-PQinitOpenSSL"/>
>        was added in <productname>PostgreSQL</productname> 8.4, so <xref
> linkend="libpq-PQinitSSL"/>
>        might be preferable for applications that need to work with older
>        versions of <application>libpq</application>.
>       </para>
>
> Keep.

+1 to just remove PQinitSSL.

> lobj.sgml:
>     <para>
>      <indexterm><primary>lo_creat</primary></indexterm>
>      The function
> <synopsis>
> Oid lo_creat(PGconn *conn, int mode);
> </synopsis>
>      creates a new large object.
>      The return value is the OID that was assigned to the new large object,
>      or <symbol>InvalidOid</symbol> (zero) on failure.
>
>      <replaceable class="parameter">mode</replaceable> is unused and
>      ignored as of <productname>PostgreSQL</productname> 8.1; however, for
>      backward compatibility with earlier releases it is best to
>      set it to <symbol>INV_READ</symbol>, <symbol>INV_WRITE</symbol>,
>      or <symbol>INV_READ</symbol> <literal>|</literal>
> <symbol>INV_WRITE</symbol>.
>      (These symbolic constants are defined
>      in the header file <filename>libpq/libpq-fs.h</filename>.)
>     </para>
>
> We need to say something about 'mode'. Keep.

We should drop that parameter and get rid of this.

> pgfreespacemap.sgml:
>   <note>
>    <para>
>     The interface was changed in version 8.4, to reflect the new FSM
>     implementation introduced in the same version.
>    </para>
>   </note>
>
> Remove.

+1

> pgstandby.sgml:
>   <para>
>    <application>pg_standby</application> is designed to work with
>    <productname>PostgreSQL</productname> 8.2 and later.
>   </para>
>
> IMHO we should remove pg_standby altogether. Until we get around to that, I
> think we should keep that note because it gives you a hint that it's old
> :-).

+1 to remove pg_standby

> pgarchivecleanup.sgml:
>   <para>
>    <application>pg_archivecleanup</application> is designed to work with
>    <productname>PostgreSQL</productname> 8.0 and later when used as a
> standalone utility,
>    or with <productname>PostgreSQL</productname> 9.0 and later when used as
> an
>    archive cleanup command.
>   </para>
>
> Ditto.

+1 to remove pg_archivecleanup

> planstats.sgml:
>   <para>
>    The examples shown below use tables in the
> <productname>PostgreSQL</productname>
>    regression test database.
>    The outputs shown are taken from version 8.3.
>    The behavior of earlier (or later) versions might vary.
>
> Should refresh the outputs..

+1

> plpgsql.sgml:
>        <para>
>         When used with a
>         <literal>BEGIN</literal> block, <literal>EXIT</literal> passes
>         control to the next statement after the end of the block.
>         Note that a label must be used for this purpose; an unlabeled
>         <literal>EXIT</literal> is never considered to match a
>         <literal>BEGIN</literal> block.  (This is a change from
>         pre-8.4 releases of <productname>PostgreSQL</productname>, which
>         would allow an unlabeled <literal>EXIT</literal> to match
>         a <literal>BEGIN</literal> block.)
>        </para>
>
> Maybe keep for a couple more years.

+1 to remove

> protocol.sgml:
>  <para>
>   This document describes version 3.0 of the protocol, implemented in
>   <productname>PostgreSQL</productname> 7.4 and later.  For descriptions
>   of the earlier protocol versions, see previous releases of the
>   <productname>PostgreSQL</productname> documentation.  A single server
>   can support multiple protocol versions.  The initial startup-request
>   message tells the server which protocol version the client is attempting
> to
>   use.  If the major version requested by the client is not supported by
>   the server, the connection will be rejected (for example, this would occur
>   if the client requested protocol version 4.0, which does not exist as of
>   this writing).  If the minor version requested by the client is not
>   supported by the server (e.g., the client requests version 3.1, but the
>   server supports only 3.0), the server may either reject the connection or
>   may respond with a NegotiateProtocolVersion message containing the highest
>   minor protocol version which it supports.  The client may then choose
> either
>   to continue with the connection using the specified protocol version or
>   to abort the connection.
>  </para>
>
> Keep.

+1 to keep.
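
As an illustration of the negotiation rules in the quoted paragraph, here is a rough Python sketch of the server-side decision. The table of supported versions, the function name, and the returned labels are all assumptions made for the example; the quoted text also allows the server to reject an unsupported minor version outright, and this sketch simply picks the NegotiateProtocolVersion path:

```python
from typing import Dict, Optional, Tuple

# Hypothetical table: major protocol version -> highest supported minor version.
SUPPORTED: Dict[int, int] = {3: 0}


def respond_to_startup(major: int, minor: int) -> Tuple[str, Optional[Tuple[int, int]]]:
    """Decide how to answer a startup request for protocol major.minor."""
    if major not in SUPPORTED:
        # Unsupported major version: the connection is rejected.
        return ("reject", None)
    highest_minor = SUPPORTED[major]
    if minor > highest_minor:
        # Unsupported minor version: offer the highest minor we support
        # (the server could also choose to reject here).
        return ("negotiate", (major, highest_minor))
    return ("accept", (major, minor))
```

A client that receives the "negotiate" answer may then either continue with the offered version or abort the connection, as the quoted text describes.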

>      <varlistentry>
>       <term>AuthenticationSCMCredential</term>
>       <listitem>
>        <para>
>         This response is only possible for local Unix-domain connections
>         on platforms that support SCM credential messages.  The frontend
>         must issue an SCM credential message and then send a single data
>         byte.  (The contents of the data byte are uninteresting; it's
>         only used to ensure that the server waits long enough to receive
>         the credential message.)  If the credential is acceptable,
>         the server responds with an
>         AuthenticationOk, otherwise it responds with an ErrorResponse.
>         (This message type is only issued by pre-9.1 servers.  It may
>         eventually be removed from the protocol specification.)
>        </para>
>       </listitem>
>      </varlistentry>
>
> Keep. It's surely still referred to in client libraries.

I'd rather have something like:

AuthenticationSCMCredential

Only issued by pre-9.1 servers, no longer used.  See older documentation
for details.

Or something along those lines.

>    <para>
>     Data of a particular data type might be transmitted in any of several
>     different <firstterm>formats</firstterm>.  As of
> <productname>PostgreSQL</productname> 7.4
>     the only supported formats are <quote>text</quote> and
> <quote>binary</quote>,
>     but the protocol makes provision for future extensions.  The desired
>     format for any value is specified by a <firstterm>format
> code</firstterm>.
>     Clients can specify a format code for each transmitted parameter value
>     and for each column of a query result.  Text has format code zero,
>     binary has format code one, and all other format codes are reserved
>     for future definition.
>    </para>
>
> Could replace the "as of PostgreSQL 7.4" with "Currently", but it's not much
> shorter.

Seems a bit confusing to say "several different formats" and then say
"well, really only text and binary".  How about:

Data of a particular data type may be transmitted in either
<quote>text</quote> or <quote>binary</quote> format.  The desired format
for any value is specified by a <firstterm>format code</firstterm>.
Clients can specify a format code for each transmitted parameter
value and for each column of a query result.  Text has format code zero,
binary has format code one, and all other format codes are reserved for
future definition.

(Do we also point out that not everything supports binary..?  If not,
seems like we should, but maybe that's covered)
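
To make the format-code rule concrete, a small Python sketch follows. The enum and function names are hypothetical; only the values (zero for text, one for binary, everything else reserved) come from the paragraph above:

```python
from enum import IntEnum


class FormatCode(IntEnum):
    TEXT = 0    # text format
    BINARY = 1  # binary format


def parse_format_code(code: int) -> FormatCode:
    """Map a wire-level format code to a known format, rejecting reserved codes."""
    try:
        return FormatCode(code)
    except ValueError:
        raise ValueError(
            f"format code {code} is reserved for future definition"
        ) from None
```

A client would apply the same check to each per-parameter and per-column code it sends or receives.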

>        <para>
>         For a <command>COPY</command> command, the tag is
>         <literal>COPY <replaceable>rows</replaceable></literal> where
>         <replaceable>rows</replaceable> is the number of rows copied.
>         (Note: the row count appears only in
>         <productname>PostgreSQL</productname> 8.2 and later.)
>        </para>
>
> I think we should keep, since we mentioned earlier that the protocol
> documentation is for 7.4 and later.

+0 to remove.  Someone working with a 8.1 or older server could look at
those docs.
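
For what it's worth, a client that has to cope with both cases could parse the tag defensively. This is only a sketch with a hypothetical helper name, assuming the tag is either "COPY" alone (no row count) or "COPY rows":

```python
from typing import Optional


def copy_row_count(tag: str) -> Optional[int]:
    """Return the row count from a COPY command tag, or None if it has none."""
    parts = tag.split()
    if not parts or parts[0] != "COPY":
        raise ValueError(f"not a COPY command tag: {tag!r}")
    if len(parts) == 1:
        # No row count present, as with the older servers mentioned in the note.
        return None
    return int(parts[1])
```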

> alter_opfamily.sgml and create_opclass.sgml:
>   <para>
>    Before <productname>PostgreSQL</productname> 8.4, the
> <literal>OPERATOR</literal>
>    clause could include a <literal>RECHECK</literal> option.  This is no
> longer
>    supported because whether an index operator is <quote>lossy</quote> is
> now
>    determined on-the-fly at run time.  This allows efficient handling of
>    cases where an operator might or might not be lossy.
>   </para>
>
> Keep, since the syntax is still supported (but ignored).

We should remove the syntax.  Having things that are accepted but
ignored isn't good, imv.

> cluster.sgml:
>   <para>
>    The syntax
> <synopsis>
> CLUSTER <replaceable class="parameter">index_name</replaceable> ON
> <replaceable class="parameter">table_name</replaceable>
> </synopsis>
>   is also supported for compatibility with pre-8.3
> <productname>PostgreSQL</productname>
>   versions.
>   </para>
>
> Keep, since the syntax is still supported.

We should remove the syntax, or just document it as alternative syntax.

> copy.sgml:
>   <para>
>    The following syntax was used before
> <productname>PostgreSQL</productname>
>    version 9.0 and is still supported:
> ...
>   <para>
>    The following syntax was used before
> <productname>PostgreSQL</productname>
>    version 7.3 and is still supported:
>
> Keep, since the syntax is still supported.

+0 to keep these references to when they were introduced.

> create_function.sgml:
>    <para>
>     Before <productname>PostgreSQL</productname> version 8.3, the
>     <literal>SET</literal> clause was not available, and so older functions
> may
>     contain rather complicated logic to save, set, and restore
>     <varname>search_path</varname>.  The <literal>SET</literal> clause is
> far easier
>     to use for this purpose.
>    </para>
>
> Keep, those old functions with complicated logic might still exist in the wild.

+1

> create_type.sgml:
>   <para>
>    Before <productname>PostgreSQL</productname> version 8.3, the name of
>    a generated array type was always exactly the element type's name with
> one
>    underscore character (<literal>_</literal>) prepended.  (Type names were
>    therefore restricted in length to one less character than other names.)
>    While this is still usually the case, the array type name may vary from
>    this in case of maximum-length names or collisions with user type names
>    that begin with underscore.  Writing code that depends on this convention
>    is therefore deprecated.  Instead, use
>    <structname>pg_type</structname>.<structfield>typarray</structfield> to
> locate the array type
>    associated with a given type.
>   </para>
>
> Let's keep it. We could remove the reference to 8.3, but would still need to
> explain the behaviour, and I think it's easiest to explain through its
> history.

+1

> create_type.sgml:
>   <para>
>    Before <productname>PostgreSQL</productname> version 8.2, the shell-type
>    creation syntax
>    <literal>CREATE TYPE <replaceable>name</replaceable></literal> did not
> exist.
>    The way to create a new base type was to create its input function first.
>    In this approach, <productname>PostgreSQL</productname> will first see
>    the name of the new data type as the return type of the input function.
>    The shell type is implicitly created in this situation, and then it
>    can be referenced in the definitions of the remaining I/O functions.
>    This approach still works, but is deprecated and might be disallowed in
>    some future release.  Also, to avoid accidentally cluttering
>    the catalogs with shell types as a result of simple typos in function
>    definitions, a shell type will only be made this way when the input
>    function is written in C.
>   </para>
>
> The deprecated way still works, so keep.

Bleh, I disagree, +1 to remove.  We shouldn't be documenting how not to
do things in modern versions, just because that's how some old version
required it to be done.

> grant.sgml:
>    <para>
>     Since <productname>PostgreSQL</productname> 8.1, the concepts of users
> and
>     groups have been unified into a single kind of entity called a role.
>     It is therefore no longer necessary to use the keyword
> <literal>GROUP</literal>
>     to identify whether a grantee is a user or a group.
> <literal>GROUP</literal>
>     is still allowed in the command, but it is a noise word.
>    </para>
>
> The GROUP keyword is still accepted, so let's keep it.

+1 to remove the GROUP keyword as being accepted here.

> pg_config-ref.sgml:
>   <para>
>    The options <option>--docdir</option>, <option>--pkgincludedir</option>,
>    <option>--localedir</option>, <option>--mandir</option>,
>    <option>--sharedir</option>, <option>--sysconfdir</option>,
>    <option>--cc</option>, <option>--cppflags</option>,
>    <option>--cflags</option>, <option>--cflags_sl</option>,
>    <option>--ldflags</option>, <option>--ldflags_sl</option>,
>    and <option>--libs</option> were added in
> <productname>PostgreSQL</productname> 8.1.
>    The option <option>--htmldir</option> was added in
> <productname>PostgreSQL</productname> 8.4.
>    The option <option>--ldflags_ex</option> was added in
> <productname>PostgreSQL</productname> 9.0.
>   </para>
>
> Let's keep these. This could still be relevant if someone is maintaining an
> extension that's backwards compatible to old versions.

+0 to keep.

> pg_dumpall.sgml:
>      <varlistentry>
>       <term><option>--lock-wait-timeout=<replaceable
> class="parameter">timeout</replaceable></option></term>
>       <listitem>
>        <para>
>         Do not wait forever to acquire shared table locks at the beginning
> of
>         the dump. Instead, fail if unable to lock a table within the
> specified
>         <replaceable class="parameter">timeout</replaceable>. The timeout
> may be
>         specified in any of the formats accepted by <command>SET
>         statement_timeout</command>.  Allowed values vary depending on the
> server
>         version you are dumping from, but an integer number of milliseconds
>         is accepted by all versions since 7.3.  This option is ignored when
>         dumping from a pre-7.3 server.
>        </para>
>       </listitem>
>      </varlistentry>
>
> pg_dump no longer supports pre-8.0 versions, so this is definitely obsolete.
> Remove.

+1

> psql-ref.sgml:
>       <listitem>
>       <para>
>        Before <productname>PostgreSQL</productname> 8.4,
>        <application>psql</application> allowed the
>        first argument of a single-letter backslash command to start
>        directly after the command, without intervening whitespace.
>        Now, some whitespace is required.
>       </para>
>       </listitem>
>
> Keep for a few more years.

+0 to remove.

> psql-ref.sgml:
>           <para><literal>old-ascii</literal> style uses plain
> <acronym>ASCII</acronym>
>           characters, using the formatting style used
>           in <productname>PostgreSQL</productname> 8.4 and earlier.
>           Newlines in data are shown using a <literal>:</literal>
>           symbol in place of the left-hand column separator.
>           When the data is wrapped from one line
>           to the next without a newline character, a <literal>;</literal>
>           symbol is used in place of the left-hand column separator.
>           </para>
>
> Keep, as long as we keep the format.

+1 to drop the format

>    <note>
>      <para>
>      Before <productname>PostgreSQL</productname> 8.2, the
>      <literal>.*</literal> syntax was not expanded in row constructors, so
>      that writing <literal>ROW(t.*, 42)</literal> created a two-field row
> whose first
>      field was another row value.  The new behavior is usually more useful.
>      If you need the old behavior of nested row values, write the inner
>      row value without <literal>.*</literal>, for instance
>      <literal>ROW(t, 42)</literal>.
>     </para>
>    </note>
>
> I'm inclined to keep this, someone might still need that behaviour, not
> necessarily for backwards-compatibility but because you might want to do that
> in an application. Or rewrite without the reference to 8.2.

I'd suggest rewriting to discuss the options and move away from it being
a history lesson.
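
As a loose analogy for the two options (Python tuples, not SQL, with made-up sample values), expanding the inner row versus nesting it looks like this:

```python
# A stand-in for one row of table t; the values are made up.
t = (1, "widget")

# Analog of ROW(t.*, 42): the fields of t are expanded in place,
# giving a flat three-field row.
expanded = (*t, 42)

# Analog of ROW(t, 42): t is kept as a single nested row value,
# giving a two-field row whose first field is itself a row.
nested = (t, 42)

assert len(expanded) == 3
assert len(nested) == 2 and nested[0] == t
```

A rewritten note could present the two spellings side by side in this way, without referring to which release changed the default.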

>   <para>
>    For comparison, the <productname>PostgreSQL</productname> 8.1
> documentation
>    contained 10,441 unique words, a total of 335,420 words, and the most
>    frequent word <quote>postgresql</quote> was mentioned 6,127 times in 655
>    documents.
>   </para>
>
>    <!-- TODO we need to put a date on these numbers? -->
>   <para>
>    Another example — the <productname>PostgreSQL</productname> mailing
>    list archives contained 910,989 unique words with 57,491,343 lexemes in
>    461,020 messages.
>   </para>
>
> Refresh the numbers.

+1

>   <note>
>    <para>
>     In the SQL standard, there is a clear distinction between users and
> roles,
>     and users do not automatically inherit privileges while roles do.  This
>     behavior can be obtained in <productname>PostgreSQL</productname> by
> giving
>     roles being used as SQL roles the <literal>INHERIT</literal> attribute,
> while
>     giving roles being used as SQL users the <literal>NOINHERIT</literal>
> attribute.
>     However, <productname>PostgreSQL</productname> defaults to giving all
> roles
>     the <literal>INHERIT</literal> attribute, for backward compatibility
> with pre-8.1
>     releases in which users always had use of permissions granted to groups
>     they were members of.
>    </para>
>   </note>
>
> Keep, since that's still how it behaves.

I'd just say that then:

<productname>PostgreSQL</productname> defaults to giving all roles the
<literal>INHERIT</literal> attribute, as this is generally seen as more
useful.

> xindex.sgml:
>   <note>
>     <para>
>     Prior to <productname>PostgreSQL</productname> 8.3, there was no concept
>     of operator families, and so any cross-data-type operators intended to
> be
>     used with an index had to be bound directly into the index's operator
>     class.  While this approach still works, it is deprecated because it
>     makes an index's dependencies too broad, and because the planner can
>     handle cross-data-type comparisons more effectively when both data types
>     have operators in the same operator family.
>    </para>
>   </note>
>
> Keep, because the old method still works.

+1 to remove.

(Didn't look at the actual patch, I'm sure it does what you had said
above)

Thanks,

Stephen

Attachment

Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Justin Pryzby
Date:
On Fri, Oct 23, 2020 at 11:09:26PM +0300, Heikki Linnakangas wrote:
> Findings in detail follow.

Are you working on a patch for these ?

Otherwise, since I started something similar in April, I could put something
together based on comments you've gotten here.

-- 
Justin



Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Heikki Linnakangas
Date:
On 25/10/2020 23:56, Justin Pryzby wrote:
> On Fri, Oct 23, 2020 at 11:09:26PM +0300, Heikki Linnakangas wrote:
>> Findings in detail follow.
> 
> Are you working on a patch for these ?

I pushed the patch I included in that email now, to remove the most 
clear cases. I'm not planning to do anything more right now.

> Otherwise, since I started something similar in April, I could put something
> together based on comments you've gotten here.

That'd be great, thanks!

- Heikki



Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Justin Pryzby
Date:
On Mon, Oct 26, 2020 at 09:18:00AM +0200, Heikki Linnakangas wrote:
> On 25/10/2020 23:56, Justin Pryzby wrote:
> > On Fri, Oct 23, 2020 at 11:09:26PM +0300, Heikki Linnakangas wrote:
> > > Findings in detail follow.
> > 
> > Are you working on a patch for these ?
> 
> I pushed the patch I included in that email now, to remove the most clear
> cases. I'm not planning to do anything more right now.
> 
> > Otherwise, since I started something similar in April, I could put something
> > together based on comments you've gotten here.
> 
> That'd be great, thanks!

Some docs that Stephen, Heikki, and Yaroslov proposed to change.

-- 
Justin

Attachment

Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Michael Paquier
Date:
On Sun, Nov 29, 2020 at 01:27:48PM -0600, Justin Pryzby wrote:
>          activity of scans already in progress.  This can result in
>          unpredictable changes in the row ordering returned by queries that
>          have no <literal>ORDER BY</literal> clause.  Setting this parameter to
> -        <literal>off</literal> ensures the pre-8.3 behavior in which a sequential
> +        <literal>off</literal> ensures the simple behavior in which a sequential
>          scan always starts from the beginning of the table.  The default
>          is <literal>on</literal>.

Mentioned upthread, but I see no problems in keeping this reference
either.

>     (last subscript varies most rapidly).
>     If the contents of two arrays are equal but the dimensionality is
>     different, the first difference in the dimensionality information
> -   determines the sort order.  (This is a change from versions of
> -   <productname>PostgreSQL</productname> prior to 8.2: older versions would claim
> -   that two arrays with the same contents were equal, even if the
> -   number of dimensions or subscript ranges were different.)
> +   determines the sort order.
>    </para>

OK to remove this one.  That was +1'd three times upthread.  I guess
that it just got missed.

>       <replaceable class="parameter">mode</replaceable> is unused and
> -     ignored as of <productname>PostgreSQL</productname> 8.1; however, for
> +     ignored since <productname>PostgreSQL</productname> 8.1; however, for
>       backward compatibility with earlier releases it is best to
>       set it to <symbol>INV_READ</symbol>, <symbol>INV_WRITE</symbol>,
>       or <symbol>INV_READ</symbol> <literal>|</literal> <symbol>INV_WRITE</symbol>

Don't see a point in changing that.  I don't agree with just removing
the parameter either as that may just break stuff.

>      Data of a particular data type might be transmitted in any of several
> -    different <firstterm>formats</firstterm>.  As of <productname>PostgreSQL</productname> 7.4
> +    different <firstterm>formats</firstterm>.  Currently
>      the only supported formats are <quote>text</quote> and <quote>binary</quote>,
>      but the protocol makes provision for future extensions.  The desired
>      format for any value is specified by a <firstterm>format code</firstterm>.

Don't think there was an agreement on that.

> -  <para>
> -   The syntax
> -<synopsis>
> -CLUSTER <replaceable class="parameter">index_name</replaceable> ON <replaceable class="parameter">table_name</replaceable>
> -</synopsis>
> -  is also supported for compatibility with pre-8.3 <productname>PostgreSQL</productname>
> -  versions.
> -  </para>
>   </refsect1>
>
>   <refsect1>

Seems to me that this should be kept for now.

> -     Before <productname>PostgreSQL</productname> 8.3, these functions would
> -     silently accept values of several non-string data types as well, due to
> -     the presence of implicit coercions from those data types to
> -     <type>text</type>.  Those coercions have been removed because they frequently
> -     caused surprising behaviors.  However, the string concatenation operator
> -     (<literal>||</literal>) still accepts non-string input, so long as at least one
> -     input is of a string type, as shown in <xref
> -     linkend="functions-string-sql"/>.  For other cases, insert an explicit
> -     coercion to <type>text</type> if you need to duplicate the previous behavior.
> +     The string concatenation operator (<literal>||</literal>) will accept
> +     non-string input, so long as at least one input is of string type, as shown
> +     in <xref linkend="functions-string-sql"/>.  For other cases, inserting an
> +     explicit coercion to <type>text</type> can be used to have non-string input
> +     accepted.
>      </para>
>     </note>

Word-by-word what Stephen has written upthread.  Agreed that this is
an improvement.

> +     Building a <acronym>GIN</acronym> index after all of the data has been
> +     loaded will typically be faster than creating the index and then filling
> +     it.  There may also be cases where, for a sufficiently large update,
> +     dropping the <acronym>GIN</acronym> index, then performing the update,
> +     and then recreating the index will be faster than a routine update,
> +     however, one should review the delayed indexing technique used for
> +     <acronym>GIN</acronym> (see <xref linkend="gin-fast-update"/> for
> +     details) and the options it provides.

We are losing some context with this formulation, particularly for the
case of the insertion of multiple keys.  So I think that it is better
to just remove the Postgres 8.4 bit, and keep the second paragraph
mostly as-is.

> - <para>
> -  As of <productname>PostgreSQL</productname> 9.1, null key values can be
> -  included in the index.  Also, placeholder nulls are included in the index
> -  for indexed items that are null or contain no keys according to
> -  <function>extractValue</function>.  This allows searches that should find empty
> -  items to do so.
> - </para>

Let's keep that, as agreed upthread.

>   <para>
>    Multicolumn <acronym>GIN</acronym> indexes are implemented by building
>    a single B-tree over composite values (column number, key value).  The
> @@ -507,7 +499,7 @@
>     Updating a <acronym>GIN</acronym> index tends to be slow because of the
>     intrinsic nature of inverted indexes: inserting or updating one heap row
>     can cause many inserts into the index (one for each key extracted
> -   from the indexed item). As of <productname>PostgreSQL</productname> 8.4,
> +   from the indexed item).
>     <acronym>GIN</acronym> is capable of postponing much of this work by inserting
>     new tuples into a temporary, unsorted list of pending entries.
>     When the table is vacuumed or autoanalyzed, or when

Agreed to remove this reference to 8.4.

> -   this operation while the server is running. Note that in PostgreSQL 9.1
> -   and earlier you will also need to update the <structname>pg_tablespace</structname>
> -   catalog with the new locations. (If you do not, <literal>pg_dump</literal> will
> -   continue to output the old tablespace locations.)
> +   this operation while the server is running.
>    </para>

I think that this should be kept.  pg_dump is supported with 9.1.

> -  <para>
> -   Previous releases failed to preserve a lock which is upgraded by a later
> -   savepoint.  For example, this code:
> -<programlisting>
> -BEGIN;
> -SELECT * FROM mytable WHERE key = 1 FOR UPDATE;
> -SAVEPOINT s;
> -UPDATE mytable SET ... WHERE key = 1;
> -ROLLBACK TO s;
> -</programlisting>
> -   would fail to preserve the <literal>FOR UPDATE</literal> lock after the
> -   <command>ROLLBACK TO</command>.  This has been fixed in release 9.3.
> -  </para>

Feels a bit early to remove IMO.

> -   <para>
> -    Note that if a <literal>FROM</literal> clause is not specified,
> -    the query cannot reference any database tables. For example, the
> -    following query is invalid:
> -<programlisting>
> -SELECT distributors.* WHERE distributors.name = 'Westward';
> -</programlisting><productname>PostgreSQL</productname> releases prior to
> -    8.1 would accept queries of this form, and add an implicit entry
> -    to the query's <literal>FROM</literal> clause for each table
> -    referenced by the query. This is no longer allowed.
> -   </para>
>    </refsect2>

OK to remove the whole paragraph here.

> -     <para>
> -      The ability to use names to reference SQL function arguments was added
> -      in <productname>PostgreSQL</productname> 9.2.  Functions to be used in
> -      older servers must use the <literal>$<replaceable>n</replaceable></literal> notation.
> -     </para>
> -    </note>
>     </sect2>

I think that's too early to remove.

So this comes down to 5 items, as per the attached.  Thoughts?
--
Michael

Attachment

Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> So this comes down to 5 items, as per the attached.  Thoughts?

These items look fine to me, except this bit seems a bit awkward:

+     Note that the delayed indexing technique used for <acronym>GIN</acronym>
+     (see <xref linkend="gin-fast-update"/> for details) makes this advice
+     less necessary, but for very large updates it may still be best to
+     drop and recreate the index.

Less necessary than what?  Maybe instead write

      When fastupdate is enabled (see ...), the penalty is much less than
      when it is not.  But for very large updates it may still be best to
      drop and recreate the index.

            regards, tom lane



Re: [doc] remove reference to pg_dump pre-8.1 switch behaviour

From
Michael Paquier
Date:
On Mon, Nov 30, 2020 at 03:46:19PM -0500, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
> > So this comes down to 5 items, as per the attached.  Thoughts?
>
> These items look fine to me, except this bit seems a bit awkward:
>
> +     Note that the delayed indexing technique used for <acronym>GIN</acronym>
> +     (see <xref linkend="gin-fast-update"/> for details) makes this advice
> +     less necessary, but for very large updates it may still be best to
> +     drop and recreate the index.
>
> Less necessary than what?  Maybe instead write
>
>       When fastupdate is enabled (see ...), the penalty is much less than
>       when it is not.  But for very large updates it may still be best to
>       drop and recreate the index.

Thanks, that's indeed better.  I used your wording, looked over the
patch again, and applied it.
--
Michael

Attachment