Re: PG 12 draft release notes - Mailing list pgsql-hackers

From Andres Freund
Subject Re: PG 12 draft release notes
Date
Msg-id 20190520221719.pqgld3krjc2docr5@alap3.anarazel.de
In response to PG 12 draft release notes  (Bruce Momjian <bruce@momjian.us>)
Responses Re: PG 12 draft release notes  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Re: PG 12 draft release notes  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: PG 12 draft release notes  (David Rowley <david.rowley@2ndquadrant.com>)
Re: PG 12 draft release notes  (Peter Geoghegan <pg@bowt.ie>)
Re: PG 12 draft release notes  (Haribabu Kommi <kommi.haribabu@gmail.com>)
Re: PG 12 draft release notes  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Re: PG 12 draft release notes  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
Hi,

Note that I've added a few questions to individuals involved with
specific points. If you're in the To: list, please search for your name.


On 2019-05-11 16:33:24 -0400, Bruce Momjian wrote:
> I have posted a draft copy of the PG 12 release notes here:
>
>     http://momjian.us/pgsql_docs/release-12.html
> They are committed to git.

Thanks!

  <title>Migration to Version 12</title>

A number of features in the compat section are more general
improvements with a side of incompatibility. Won't it be confusing to
e.g. have the ryu floating point conversion speedups in the compat
section, but not in the "General Performance" section?


     <para>
      Remove the special behavior of <link
      linkend="datatype-oid">OID</link> columns (Andres Freund,
      John Naylor)
     </para>

Should we mention that tables with OIDs have to have their oids removed
before they can be upgraded?


     <para>
      Refactor <link linkend="functions-geometry">geometric
      functions</link> and operators (Emre Hasegeli)
     </para>

     <para>
      This could lead to more accurate, but slightly different, results
      from previous releases.
     </para>
    </listitem>
    <listitem>
<!--
Author: Tomas Vondra <tomas.vondra@postgresql.org>
2018-08-16 [c4c340088] Use the built-in float datatypes to implement geometric 
-->

     <para>
      Restructure <link linkend="datatype-geometric">geometric
      types</link> to handle NaN, underflow, overflow and division by
      zero more consistently (Emre Hasegeli)
     </para>
    </listitem>

    <listitem>
<!--
Author: Tomas Vondra <tomas.vondra@postgresql.org>
2018-09-26 [2e2a392de] Fix problems in handling the line data type
-->

     <para>
      Improve behavior and error reporting for the <link
      linkend="datatype-geometric">line data type</link> (Emre Hasegeli)
     </para>
    </listitem>

Is that sufficient explanation? Feels like we need to expand a bit
more. In particular, is it possible that a subset of the changes here
require reindexing?

Also, aren't three different entries a bit too much?


     <para>
      Avoid performing unnecessary rounding of <link
      linkend="datatype-float"><type>REAL</type></link> and <type>DOUBLE
      PRECISION</type> values (Andrew Gierth)
     </para>

     <para>
      This dramatically speeds up processing of floating-point
      values but causes additional trailing digits to
      potentially be displayed.  Users wishing to have output
      that is rounded to match the previous behavior can set <link
      linkend="guc-extra-float-digits"><literal>extra_float_digits=0</literal></link>,
      which is no longer the default.
     </para>
    </listitem>

Isn't it exactly the *other* way round? *Previously* we'd output
additional trailing digits. The new algorithm will instead output
*exactly* the required number of digits?
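
A rough way to compare the two behaviours outside the server is the
Python sketch below. It is not PostgreSQL code and rests on two
assumptions: that the pre-12 float8 output behaves like printf's "%g"
with 15 + extra_float_digits significant digits, and that Python's
repr(), which emits the shortest string that round-trips, is a fair
stand-in for the new algorithm. The function names are made up for the
illustration.

```python
def old_style(value: float, extra_float_digits: int = 0) -> str:
    # Assumed model of the pre-12 output: a fixed number of
    # significant digits, 15 + extra_float_digits.
    return "%.*g" % (15 + extra_float_digits, value)


def shortest_round_trip(value: float) -> str:
    # Stand-in for the new output: the shortest decimal string that
    # converts back to exactly the same double.
    return repr(value)


if __name__ == "__main__":
    for v in (0.1, 0.1 + 0.2):
        print("old, extra_float_digits=0:", old_style(v))
        print("old, extra_float_digits=3:", old_style(v, 3))
        print("shortest round-trip:      ", shortest_round_trip(v))
```

Under these assumptions both readings show up: against the old default
(extra_float_digits=0) the new output can be longer, and against
extra_float_digits=3 it is never longer while still round-tripping.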


      <listitem>
<!--
Author: Tom Lane <tgl@sss.pgh.pa.us>
2019-02-11 [1d92a0c9f] Redesign the partition dependency mechanism.
-->

       <para>
        Improve handling of partition dependency (Tom Lane)
       </para>

       <para>
        This prevents the creation of inconsistent partition hierarchies
        in rare cases.
       </para>
      </listitem>

That seems not very informative for users?


      <listitem>
<!--
Author: Alexander Korotkov <akorotkov@postgresql.org>
2018-07-28 [d2086b08b] Reduce path length for locking leaf B-tree pages during 
Author: Peter Geoghegan <pg@bowt.ie>
2019-03-25 [f21668f32] Add "split after new tuple" nbtree optimization.
-->

       <para>
        Improve speed of btree index insertions (Peter Geoghegan,
        Alexander Korotkov)
       </para>

       <para>
        The new code improves the space-efficiency of page splits,
        reduces locking overhead, and gives better performance for
        <command>UPDATE</command>s and <command>DELETE</command>s on
        indexes with many duplicates.
       </para>
      </listitem>

      <listitem>
<!--
Author: Peter Geoghegan <pg@bowt.ie>
2019-03-20 [dd299df81] Make heap TID a tiebreaker nbtree index column.
Author: Peter Geoghegan <pg@bowt.ie>
2019-03-20 [fab250243] Consider secondary factors during nbtree splits.
-->

       <para>
        Have new btree indexes sort duplicate index entries in heap-storage
        order (Peter Geoghegan, Heikki Linnakangas)
       </para>

       <para>
        Indexes <application>pg_upgraded</application> from previous
        releases will not have this ordering.
       </para>
      </listitem>

I'm not sure that the grouping here is quite right. And the second entry
probably should have some explanation about the benefits?
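
One benefit that could be mentioned, as far as I understand the commit,
is that duplicates become distinguishable by heap TID, so a specific
entry can be located by an ordinary search instead of walking every
duplicate. The Python sketch below is only a toy model of that idea; a
sorted list of (key, heap_tid) pairs stands in for the index, and the
names are hypothetical, not nbtree code.

```python
import bisect


def build_index(rows):
    # Toy "index": entries sorted by key, then by heap TID
    # (block number, offset) as the tiebreaker.
    return sorted(rows)


def find_entry(index, key, heap_tid):
    # Because the TID is part of the sort order, a single binary
    # search finds one particular duplicate.
    pos = bisect.bisect_left(index, (key, heap_tid))
    if pos < len(index) and index[pos] == (key, heap_tid):
        return pos
    return None


rows = [
    ("red", (7, 1)),
    ("blue", (3, 2)),
    ("red", (2, 5)),
    ("red", (2, 1)),
    ("blue", (1, 4)),
]
index = build_index(rows)

if __name__ == "__main__":
    print(index)
    print(find_entry(index, "red", (2, 5)))
```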


      <listitem>
<!--
Author: Peter Eisentraut <peter_e@gmx.net>
2018-11-14 [1b5d797cd] Lower lock level for renaming indexes
-->

       <para>
        Reduce locking requirements for index renaming (Peter Eisentraut)
       </para>
      </listitem>

Should we specify the newly required lock level? It's quite relevant
for users to know what exactly they're now able to do concurrently.


       <para>
        Allow <link linkend="queries-with">common table expressions</link>
        (<acronym>CTE</acronym>) to be inlined in later parts of the query
        (Andreas Karlsson, Andrew Gierth, David Fetter, Tom Lane)
       </para>

       <para>
        Specifically, <acronym>CTE</acronym>s are inlined
        if they are not recursive and are referenced only
        once later in the query.  Inlining can be prevented by
        specifying <literal>MATERIALIZED</literal>, and forced by
        specifying <literal>NOT MATERIALIZED</literal>.  Previously,
        <acronym>CTE</acronym>s were never inlined and were always
        evaluated before the rest of the query.
       </para>

Hm. Is it actually correct to say that "were always evaluated before the
rest of the query."? My understanding is that that's not actually how
they behaved. Materialization for CTE scans was on-demand (i.e. when
needed by a CTE scan), and even for DML CTEs we'd only force the
underlying query to completion at the end of the query?



      <listitem>
<!--
Author: Tom Lane <tgl@sss.pgh.pa.us>
2019-02-09 [1fb57af92] Create the infrastructure for planner support functions.
-->

       <para>
        Add support for <link linkend="sql-createfunction">function
        selectivity</link> (Tom Lane)
       </para>
      </listitem>

Hm, that message doesn't seem like an accurate description of that
commit (if anything it's a391ff3c?). Given that it all requires C
hackery, perhaps we ought to move it to the source code section? And
isn't the most important part of this set of changes the following commit?

commit 74dfe58a5927b22c744b29534e67bfdd203ac028
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   2019-02-11 21:26:08 -0500

    Allow extensions to generate lossy index conditions.


      <listitem>
<!--
Author: Tomas Vondra <tomas.vondra@postgresql.org>
2019-01-29 [36a1281f8] Separate per-batch and per-tuple memory contexts in COPY
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
2019-01-25 [9556aa01c] Use single-byte Boyer-Moore-Horspool search even with mu
Author: Andres Freund <andres@anarazel.de>
2019-01-26 [a9c35cf85] Change function call information to be variable length.
-->

       <para>
        Greatly reduce memory consumption of <xref linkend="sql-copy"/>
        and function calls (Andres Freund, Tomas Vondra, Tom Lane)
       </para>
      </listitem>

Grouping these three changes together makes no sense to me.

I think the first commit just ought not to be mentioned separately; it's
just a fix for a memory leak in 31f3817402, essentially a 12-only bugfix?

The second commit is about position() etc, which seems not to match that
description either?

The third is probably more appropriate to be in the source code
section. While it does speed up function calls a bit (in particular
plpgsql which is very function call heavy), it also is a breaking change
for some external code? Not sure why Tom is listed with this entry?


      <listitem>
<!--
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
2019-01-25 [9556aa01c] Use single-byte Boyer-Moore-Horspool search even with mu
-->

       <para>
        Improve search performance for multi-byte characters (Heikki
        Linnakangas)
       </para>
      </listitem>

That's the second reference to the commit. I suspect this is much better
separate, so I'd just remove it from above.


      <listitem>
<!--
Author: Stephen Frost <sfrost@snowman.net>
2019-04-02 [4d0e994ee] Add support for partial TOAST decompression
-->

       <para>
        Allow <link linkend="storage-toast"><literal>TOAST</literal></link>
        values to be minimally decompressed (Paul Ramsey)
       </para>

I'd s/minimal/partial/ - I don't think the code guarantees anything
about it being minimal? And "minimally decompressed" also is somewhat
confusing, because it sounds like it's about the compression quality
rather than only decompressing part of the data.


      <listitem>
<!--
Author: Michael Paquier <michael@paquier.xyz>
2018-08-10 [f841ceb26] Improve TRUNCATE by avoiding early lock queue
-->

       <para>
        Prevent <xref linkend="sql-truncate"/> from requesting a lock on
        tables for which it lacks permission (Michaël Paquier)
       </para>

       <para>
        This prevents unauthorized locking delays.
       </para>
      </listitem>

      <listitem>
<!--
Author: Michael Paquier <michael@paquier.xyz>
2018-08-27 [a556549d7] Improve VACUUM and ANALYZE by avoiding early lock queue
-->

       <para>
        Prevent <command>VACUUM</command> and <command>ANALYZE</command>
        from requesting a lock on tables for which it lacks permission
        (Michaël Paquier)
       </para>

       <para>
        This prevents unauthorized locking delays.
       </para>
      </listitem>


I don't think this should be in the <title><acronym>Authentication</acronym></title>
section.

Also perhaps, s/it/the user/, or "the caller"?


      <listitem>
<!--
Author: Tom Lane <tgl@sss.pgh.pa.us>
2019-03-10 [cbccac371] Reduce the default value of autovacuum_vacuum_cost_delay
-->

       <para>
        Reduce the default value of <xref
        linkend="guc-autovacuum-vacuum-cost-delay"/> to 2ms (Tom Lane)
       </para>
      </listitem>

I think this needs to explain that this can increase autovacuum's IO
throughput considerably.
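
As a back-of-the-envelope illustration of the size of the effect, here
is a small Python calculation. It assumes the other cost settings are
at what I believe are their defaults (vacuum_cost_limit = 200,
vacuum_cost_page_miss = 10), 8kB pages, a previous default delay of
20ms, and that every page is a miss; it also ignores the time spent
doing the actual work, so it is an upper bound rather than a
measurement.

```python
PAGE_SIZE_BYTES = 8192  # assumed default block size


def max_read_rate_mib_per_s(cost_delay_ms, cost_limit=200, page_miss_cost=10):
    # Pages that can be read before the cost limit forces a sleep.
    pages_per_round = cost_limit / page_miss_cost
    # Upper bound on rounds per second if only the sleep takes time.
    rounds_per_second = 1000.0 / cost_delay_ms
    bytes_per_second = pages_per_round * rounds_per_second * PAGE_SIZE_BYTES
    return bytes_per_second / (1024 * 1024)


old_rate = max_read_rate_mib_per_s(20)  # assumed previous default
new_rate = max_read_rate_mib_per_s(2)   # new default from the entry above

if __name__ == "__main__":
    print(f"20ms delay: at most {old_rate} MiB/s")
    print(f" 2ms delay: at most {new_rate} MiB/s")
```

With those assumptions the ceiling moves by a factor of ten, which is
the kind of statement the entry could make.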

      <listitem>
<!--
Author: Tom Lane <tgl@sss.pgh.pa.us>
2019-03-10 [caf626b2c] Convert [autovacuum_]vacuum_cost_delay into floating-poi
-->

       <para>
        Allow <xref linkend="guc-vacuum-cost-delay"/> to specify
        sub-millisecond delays (Tom Lane)
       </para>

       <para>
        Floating-point values can also now be specified.
       </para>
      </listitem>

And this should be merged with the previous entry?


      <listitem>
<!--
Author: Tom Lane <tgl@sss.pgh.pa.us>
2019-03-10 [caf626b2c] Convert [autovacuum_]vacuum_cost_delay into floating-poi
-->

       <para>
        Allow time-based server variables to use <link
        linkend="config-setting">micro-seconds</link> (us) (Tom Lane)
       </para>
      </listitem>

      <listitem>
<!--
Author: Tom Lane <tgl@sss.pgh.pa.us>
2019-03-11 [1a83a80a2] Allow fractional input values for integer GUCs, and impr
-->

       <para>
        Allow fractional input for integer server variables (Tom Lane)
       </para>

       <para>
        For example, <command>SET work_mem = '30.1GB'</command>.
       </para>
      </listitem>

      <listitem>
<!--
Author: Tom Lane <tgl@sss.pgh.pa.us>
2019-03-10 [caf626b2c] Convert [autovacuum_]vacuum_cost_delay into floating-poi
-->

       <para>
        Allow units to be specified for floating-point server variables
        (Tom Lane)
       </para>
      </listitem>

Can't we combine these? Seems excessively detailed in comparison to the
rest of the entries.
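
If the entries are combined, one example could cover them. The Python
sketch below models how a fractional value such as '30.1GB' might end
up as an integer number of kB. It assumes the rule, as I understand the
documentation, that a fractional value with a unit is rounded to a
multiple of the next smaller unit; the function and table names are
invented for the illustration and this is not the GUC parsing code.

```python
import re

KB_PER_UNIT = {"kB": 1, "MB": 1024, "GB": 1024 ** 2, "TB": 1024 ** 3}
NEXT_SMALLER_UNIT = {"TB": "GB", "GB": "MB", "MB": "kB"}


def memory_setting_to_kb(text):
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(kB|MB|GB|TB)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    # Assumed rule: a fractional value is first rounded to a whole
    # number of the next smaller unit, if there is one.
    if not value.is_integer() and unit in NEXT_SMALLER_UNIT:
        smaller = NEXT_SMALLER_UNIT[unit]
        value = round(value * KB_PER_UNIT[unit] / KB_PER_UNIT[smaller])
        unit = smaller
    # Finally express the result in kB, the base unit assumed here.
    return int(round(value * KB_PER_UNIT[unit]))


if __name__ == "__main__":
    print(memory_setting_to_kb("30.1GB"))
    print(memory_setting_to_kb("4MB"))
```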


     <listitem>
<!--
Author: Peter Eisentraut <peter@eisentraut.org>
2019-01-11 [ff8530605] Add value 'current' for recovery_target_timeline
-->

      <para>
       Add an explicit value of <literal>current</literal> for <xref
       linkend="guc-recovery-target-time"/> (Peter Eisentraut)
      </para>
     </listitem>

Seems like this should be combined with the earlier "Cause recovery to
advance to the latest timeline by default" entry.


     <listitem>
<!--
Author: Peter Eisentraut <peter@eisentraut.org>
2019-03-30 [fc22b6623] Generated columns
-->

      <para>
       Add support for <link linkend="sql-createtable">generated
       columns</link> (Peter Eisentraut)
      </para>

      <para>
       Rather than storing a value only at row creation time, generated
       columns are also modified during updates, and can reference other
       table columns.
      </para>
     </listitem>

I find this description confusing. How about cribbing from the commit?
Roughly like

    This allows creating columns that are computed from expressions,
    including references to other columns in the same table, rather than
    having to be specified by the inserter/updater.

I think we also ought to mention that this covers only stored generated
columns, given that the SQL feature also includes virtual columns?


     <listitem>
<!--
Author: Fujii Masao <fujii@postgresql.org>
2019-04-08 [119dcfad9] Add vacuum_truncate reloption.
Author: Fujii Masao <fujii@postgresql.org>
2019-05-08 [b84dbc8eb] Add TRUNCATE parameter to VACUUM.
-->

      <para>
       Add <xref linkend="sql-vacuum"/> and <command>CREATE
       TABLE</command> options to prevent <command>VACUUM</command>
       from truncating trailing empty pages (Tsunakawa Takayuki)
      </para>

      <para>
       The options are <varname>vacuum_truncate</varname> and
       <varname>toast.vacuum_truncate</varname>.  This reduces vacuum
       locking requirements.
      </para>
     </listitem>

Maybe add something like: "This can be helpful to avoid query
cancellations on standbys that are not avoided by hot_standby_feedback."?


     <listitem>
<!--
Author: Robert Haas <rhaas@postgresql.org>
2019-04-04 [a96c41fee] Allow VACUUM to be run with index cleanup disabled.
-->

      <para>
       Allow vacuum to avoid index cleanup with the
       <literal>INDEX_CLEANUP</literal> option (Masahiko Sawada)
      </para>
     </listitem>

I think we ought to expand a bit more on why one would do that,
including perhaps some caveat?


     <listitem>
<!--
Author: Peter Eisentraut <peter@eisentraut.org>
2019-03-19 [590a87025] Ignore attempts to add TOAST table to shared or catalog 
-->

      <para>
       Allow modifications of system tables using <xref
       linkend="sql-altertable"/> (Peter Eisentraut)
      </para>

      <para>
       This allows modifications of <literal>reloptions</literal> and
       autovacuum settings.
      </para>
     </listitem>

I think the first paragraph is a bit dangerous. This does *not*
generally allow modifications of system tables using ALTER TABLE.


     <listitem>
<!--
Author: Tom Lane <tgl@sss.pgh.pa.us>
2019-01-30 [5f5c01459] Allow RECORD and RECORD[] to be specified in function co
-->

      <para>
       Allow <type>RECORD</type> and <type>RECORD[]</type> to be specified
       as a function <link linkend="sql-createfunction">return-value
       record</link> (Elvis Pranskevichus)
      </para>

      <para>
       DETAIL?
      </para>
     </listitem>

This description doesn't sound accurate to me. Tom?


      <listitem>
<!--
Author: Tom Lane <tgl@sss.pgh.pa.us>
2018-09-25 [5b7e03670] Avoid unnecessary precision loss for pgbench's - -rate ta
-->

       <para>
        Compute behavior based on pgbench's <option>--rate</option>
        value more precisely (Tom Lane)
       </para>
      </listitem>

"Computing behavior" sounds a bit odd. Maybe "Improve precision of
pgbench's <option>--rate</option> option"?


      <listitem>
<!--
Author: Thomas Munro <tmunro@postgresql.org>
2018-07-13 [387a5cfb9] Add pg_dump - -on-conflict-do-nothing option.
-->

       <para>
        Allow restoration of an <command>INSERT</command>-statement dump
        to skip rows which would cause conflicts (Surafel Temesgen)
       </para>

       <para>
        The <application>pg_dump</application> option is
        <option>--on-conflict-do-nothing</option>.
       </para>
      </listitem>

Hm, this doesn't seem that clear. It's not really a restore-time
option, and it sounds a bit like that in the above. How about instead
saying something like:
Allow pg_dump to emit INSERT ... ON CONFLICT DO NOTHING (Surafel).


      <listitem>
<!--
Author: Andrew Dunstan <andrew@dunslane.net>
2019-02-18 [af25bc03e] Provide an extra-float-digits setting for pg_dump / pg_d
-->

       <para>
        Allow the number of float digits to be specified
        for <application>pg_dump</application> and
        <application>pg_dumpall</application> (Andrew Dunstan)
       </para>

       <para>
        This allows the float digit output to match previous dumps.
       </para>

Hm, feels like that should be combined with the ryu compat entry?


      <para>
       Add <xref linkend="sql-create-access-method"/> command to create
       new table types (Haribabu Kommi, Andres Freund, Álvaro Herrera,
       Dimitri Dolgov)
      </para>

A few points:

1) Is this really source code, given that CREATE ACCESS METHOD TYPE
   TABLE is a DDL command, and USING (...) for CREATE TABLE etc is an
   option to DDL commands?

2) I think the description sounds a bit too much like it's about new
   forms of tables, rather than their storage. How about something
   roughly like:

   Allow different <link linkend="tableam">table access methods</> to be
   <link linkend="sql-create-access-method">created</> and <link
   linkend="sql-createtable-method">used</>. This allows developing and
   using new ways of storing and accessing table data, optimized for
   different use-cases, without having to modify
   PostgreSQL. The existing <literal>heap</literal> access method
   remains the default.

3) This misses a large set of commits around making tableam possible, in
   particular the commits around

commit 4da597edf1bae0cf0453b5ed6fc4347b6334dfe1
Author: Andres Freund <andres@anarazel.de>
Date:   2018-11-16 16:35:11 -0800

    Make TupleTableSlots extensible, finish split of existing slot type.

   Given that those commits entail an API break relevant for extensions,
   should we have them as a separate "source code" note?

4) I think the attribution isn't quite right. For one, a few names with
   substantial work are missing (Amit Khandekar, Ashutosh Bapat,
   Alexander Korotkov), and the order doesn't quite seem right. On the
   latter part I might be somewhat petty, but I spent *many* months of
   my life on this.

   How about:
   Andres Freund, Haribabu Kommi, Alvaro Herrera, Alexander Korotkov,
   David Rowley, Dimitri Dolgov
   if we keep 3) separate and
   Andres Freund, Haribabu Kommi, Alvaro Herrera, Ashutosh Bapat,
   Alexander Korotkov, Amit Khandekar, David Rowley, Dimitri Dolgov
   otherwise?

   I think it might actually make sense to take David off this list,
   because his tableam work is essentially part of its own entry, as
<!--
Author: Peter Eisentraut <peter_e@gmx.net>
2018-08-01 [0d5f05cde] Allow multi-inserts during COPY into a partitioned table
-->

       <para>
        Improve speed of <command>COPY</command> into partitioned tables
        (David Rowley)
       </para>

   since his copy.c portions of 86b85044e823a largely are a rewrite of
   the above commit.


     <listitem>
<!--
Author: Greg Stark <stark@mit.edu>
2018-10-09 [36e9d413a] Add "B" suffix for bytes to docs
-->

      <para>
       Document that the <literal>B</literal>/bytes units can be specified
       for <link linkend="config-setting">server variables</link>
       (Greg Stark)
      </para>
     </listitem>

Given how large some of the changes we skip over in the release notes
are, I don't really see a point in including changes like this. It feels
like we'd at the very least also have to include larger changes with
typo/grammar fixes etc?

Greetings,

Andres Freund


