Re: 9.4 release notes - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: 9.4 release notes
Date
Msg-id 20140505132839.GA29541@momjian.us
Whole thread Raw
In response to Re: 9.4 release notes  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: 9.4 release notes
List pgsql-hackers
On Sun, May  4, 2014 at 03:49:27PM +0200, Andres Freund wrote:
> Hi,
> 
> On 2014-05-04 08:46:07 -0400, Bruce Momjian wrote:
> > Feedback expected and welcomed.  I expect to be modifying this until we
> > release 9.4 final.  I have marked items where I need help with question
> > marks.
> 
> Thanks for doing that work. Some comments inline:

Oh, I forgot to thank committers for making this job easier every year. 
The commit titles are helpful, the descriptions good, and many items are
marked as backward incompatible where appropriate.  Git hashes make
looking up commit details easy.

> +    <listitem>
> +     <para>
> +      Remove system column pg_class.reltoastidxid (Michael Paquier)
> +     </para>
> +
> +     <para>
> +      Instead use normal index access methods.
> +     </para>
> +    </listitem>
> 
> The explanation doesn't seem helpful to me. I'd just remove it as the
> full explanation seems to be to complex and not actually interesting.

OK, description removed.  I know pg_upgrade has to be modified to handle
this.  Hopefully others will understand how to adjust things.  You are
right that the full description is complex.

> 
> +   <sect3>
> +    <title>Server</title>
> +
> +     <itemizedlist>
> +
> +      <listitem>
> +       <para>
> +        Have VACUUM properly report dead but not removable rows to the statistics collector (Hari Babu)
> +       </para>
> +
> +       <para>
> +        Previously these were reported as live rows.
> +       </para>
> +      </listitem>
> +
> +      <listitem>
> +       <para>
> +        Improve SSL renegotiation handling (Álvaro Herrera)
> +       </para>
> +      </listitem>
> +
> +      <listitem>
> +       <para>
> +        During immediate shutdown, send uncatchable termination signals to child processes that have not already
shutdown(MauMau,
 
> +        Álvaro Herrera)
> +       </para>
> +
> +       <para>
> +        This reduces the likelihood of orphaned child processes after postmaster shutdown.
> +       </para>
> +      </listitem>
> +
> +      <listitem>
> +       <para>
> +        Improve randomness of the database system identifier (Tom Lane)
> +       </para>
> +      </listitem>
> +
> +      <listitem>
> +       <para>
> +        Allow background workers to be dynamically started and terminated (Robert Haas)
> +       </para>
> 
> This is much more important than the previous features imo. Also, I'd
> add the word "registered" in the above list.
> 
> +      <listitem>
> +       <para>
> +        Allow dynamic allocation of shared memory segments (Robert Haas, Amit Kapila)
> +       </para>
> +      </listitem>
> 
> Should also be earlier in the list.

OK, added "registered" and I moved this item and the one above to #2 and
#3.  Let me know if that isn't good.  I was thinking the dynamic stuff
wasn't useful until we had parallelism working, but I think I was wrong.

> 
> +      <listitem>
> +       <para>
> +        Freeze tuples when tables are written with CLUSTER or VACUUM FULL (Robert Haas, Andres Freund)
> +       </para>
> +
> +       <para>
> +        This avoids later freezing overhead.
> +       </para>
> +      </listitem>
> 
> This sentence seems a bit awkward to me. Maybe "This avoids the need to
> freeze tuples at some later point in time."

OK, I wrote, "This avoids the need to freeze the tuples in the future."
> 
> +      <listitem>
> +       <para>
> +        Improve speed of accessing many sequence values (David Rowley)
> +       </para>
> +      </listitem>
> 
> I think this description isn't accurate. Isn't it something like
> "Improve speed of accesessing many different sequences in the same backend."

Yes, better, thanks.

> +      <listitem>
> +       <para>
> +        Use memory barriers to avoid some spinlock use (Heikki Linnakangas)
> +       </para>
> +      </listitem>
> 
> This is about 1a3d104475ce01326fc00601ed66ac4d658e37e5? If so I'd put it
> as: "Reduce spinlock contention during WAL replay." or similar.

Uh, yes.  I am not sure why I fixed on the memory barrier/spinlock part.
I used your text and moved it to the replication section.

> This imo deserves to be further up this list as this is one of the
> biggest bottlenecks in HS node performance.

I didn't move it too high up as it isn't a user-visible change, but I
put it at the top of the non-visible part.  Let me know if it needs
further promotion.

> +      <listitem>
> +       <para>
> +        Add huge_pages configuration parameter to attempt to use huge translation look-aside buffer (TLB) pages on
Linux(Christian Kruse,
 
> +        Richard Poole, Abhijit Menon-Sen)
> +       </para>
> 
> I think this is too detailed - how about: "... to use huge pages ..."?

I am not sure on this as the default is "try", but I updated the wording
like your suggested:
Add huge_pages configuration parameter to enable hugetranslation look-aside buffer (TLB) pages on Linux

> +       <para>
> +        This can improve performance on large memory systems.
> +       </para>
> +      </listitem>
> 
> Should this be in the performance section?

Well, performance is listed as "General Performance". We also have index
changes that help performance too but they are under "Index".  I think
it is probably in the right place as "configuration" is more specific
than "general performance".

> +      <listitem>
> +       <para>
> +        Add configuration variable data_checksums to report whether data page checksums are enabled (Bernd Helmle)
> +       </para>
> +      </listitem>
> 
> Since this has been backpatched since it should probably be removed.

Oh, I missed that.  Removed.  When commits are _later_ backpatched, I
sometimes miss that.

> 
> +      <listitem>
> +       <para>
> +        Use the last specified recovery_target if multiple are specified (Heikki Linnakangas)
> +       </para>
> +      </listitem>
> 
> This might want an entry in the compatibility section.

Hmmm, good question.  I always assumed doing multiple recovery_targets
was just user error, but I can see people who relied on that.  Moved.

> 
> +      <listitem>
> +       <para>
> +        Add replication slots to report the WAL activity on streaming standbys (Andres Freund, Robert Haas)
> +       </para>
> +
> +       <para>
> +        Description?  Logical only?
> +       </para>
> +      </listitem>
> 
> No, it's not logical only. How about:
> 'Replication slots allow to reserve resources like WAL on the primary
> that are needed by standby servers."

OK, new wording:
       Replication slots allow preservation of resources like WAL files on the       primary that are needed by standby
servers.

> 
> +      <listitem>
> +       <para>
> +        Improve return codes from external recovery commands (Peter Eisentraut)
> +       </para>
> +      </listitem>
> 
> This doesn't seem to be very descriptive. s/improve/report/?

Oh, I missed that nuance.  Updated.

> +      <listitem>
> +       <para>
> +        Write WAL records of running transactions every 15 seconds ? (Andres Freund)
> +       </para>
> +      </listitem>
> 
> "Write WAL records about running transactions more frequently."
> 
> "This allows standby servers to startup faster and to cleanup resources
> more aggressively."

Yes, that is exactly what I needed to know!  I couldn't figure that out.
Added.

> +     </itemizedlist>
> +
> +     <sect4>
> +      <title>Logical Change-Set Encoding</title>
> 
> s/Encoding/Extraction/
> 
> This probably deserves one entry about the new feature instead of
> separate entries about individually commited parts.

I added a paragraph at the top to explain the feature.  We added a bunch
of new stuff as far as user-visible changes so I kept them listed,
particularly because it isn't clear right away that they are related to
this feature.

> REPLICA IDENTITY should probably documented separately. Other solutions
> can use it as well.

Well, as far as 9.4 it is for this feature, so putting it somewhere else
doesn't make sense to me.  If someone else uses it in a future release
we can document that too.

> 
> +      <listitem>
> +       <para>
> +        Allow security barrier views automatically updateable (Dean Rasheed)
> +       </para>
> 
> *to be?

Fixed.

> 
> +      <listitem>
> +       <para>
> +        Add functions pg_filenode_relation() and pg_relation_filepath() to do relation/relfilenode conversions
(AndresFreund)
 
> +       </para>
> 
> pg_relation_filepath() isn't new. I think this should be something like:
> "Add function pg_filenode_relation() to allow for more efficient
> filenode to relation lookup."

Done.

> 
> +       <para>
> +        These use a new pg_class index to speed lookups.
> +       </para>
> +      </listitem>
> 
> Don't think that's interesting for the release notes.

OK, removed.  Your text got the performance aspect.
> 
> +       <listitem>
> +        <para>
> +         Have \d display disabled system triggers (Bruce Momjian)
> +        </para>
> +
> +        <para>
> +         Previously if you disabled all triggers, only user triggers would show as disabled.
> +        </para>
> +       </listitem>
> 
> Maybe mention that this could lead to broken foreign key checks after an
> ALTER TABLE .. DISABLE TRIGGERS?.

That seems too complex for the release notes.  The new display was to
remind people that they had system-level triggers disabled.  I don't think we
want to get into telling people _why_ they need to be reminded.

> +   <sect3>
> +    <title>Source Code</title>
> +
> +     <itemizedlist>
> +
> +      <listitem>
> +       <para>
> +        Improve the way tuples are frozen, to preserve forensic information ((Robert Haas, Andres Freund)
> +       </para>
> +
> +       <para>
> +        Code that inspects tuple flag bits will need to be modified
> +       </para>
> +      </listitem>
> 
> Not sure if that's just a source code level change.

I am unclear as well, but I assume people viewing rows via pgstattuple
would need to adjust things.

> +      <listitem>
> +       <para>
> +        Remove SnapshotNow and HeapTupleSatisfiesNow (Robert Haas)
> +       </para>
> +
> +       <para>
> +        All existing uses have been switched to more appropriate snapshot types.
> +       </para>
> +      </listitem>
> +
> +      <listitem>
> +       <para>
> +        Use an MVCC snapshot (rather than SnapshotNow) for catalog scans (Robert Haas)
> +       </para>
> +      </listitem>
> 
> I wonder if those two shouldn't be one entry.

Agreed.  Put the "catalog scans" item in the description of the previous
item.
> 
> +      <listitem>
> +       <para>
> +        Use printf() modifier "z" to specify size_t values (Andres Freund)
> +       </para>
> +      </listitem>
> 
> s/Use/Add infrastructure to allow to use the %x printf modifier .../

Done.

> 
> +      <listitem>
> +       <para>
> +        Memory barrier changes?
> +       </para>
> +      </listitem>
> 
> Not sure what that refers to?

I think it is this.  I have no idea if it is important:
    Fix a few problems in barrier.h.    On HPPA, implement pg_memory_barrier() as pg_compiler_barrier(), which
shouldbe correct since this arch doesn't do memory access reordering,    and is anyway better than the
completely-nonfunctional-on-this-arch   dummy_spinlock code.  (But note this patch only fixes things for gcc,    not
forbuilds with HP's compiler.)    Also, fix incorrect default definition of pg_memory_barrier as a macro    requiring
anargument.    Also, fix incorrect spelling of "#elif" as "#else if" in icc code path    (spotted by pgindent).    This
doesn'tcome close to fixing all of the functional and stylistic    deficiencies in barrier.h, but at least it un-breaks
mypersonal build.    Now that we're actually using barriers in the code, this file is going    to need some serious
attention.(TomLane)[89779bf2c] 2013-07-17 18:38:20 -0400
 

I have removed the release note item on it.

> +      <listitem>
> +       <para>
> +        Remove spinlock support for unsupported platforms SINIX, Sun3, and NS32K (Robert Haas)
> +       </para>
> +      </listitem>
> 
> I'd put it as "Remove leftover code for the unsupported
> ... platforms.".

I think I worded it that way because we really _just_ remove the
spinlock stuff --- the rest of the support was gone long ago, I think. 
In fact, I think they are CPU types, not actual platforms, at least for
NS32K.

> +      <listitem>
> +       <para>
> +        Rewrite duplicate_oids Unix hell script in Perl (Andrew Dunstan)
> +       </para>
> +      </listitem>
> 
> Heh. s/hell/shell/

Yes, funny, and removed.  ;-)

> +      <listitem>
> +       <para>
> +        Remove IRIX port (Robert Haas)
> +       </para>
> +      </listitem>
> 
> Should probably be next to the unsupported platforms list.

Agreed.  Removed.

> +      <listitem>
> +       <para>
> +        Have pg_stat_statements use a flat file for query text storage, allowing higher limits (Peter Geoghegan)
> +       </para>
> +
> +       <para>
> +        Also add the ability to retrieve all pg_stat_statements information except the query text.  This allows
programsto reuse the query
 
> +        text already retrieved by referencing queryid.
> +       </para>
> +      </listitem>
> 
> This isn't an optional thing, is it?

Here is the commit:
    Keep pg_stat_statements' query texts in a file, not in shared memory.    This change allows us to eliminate the
previouslimit on stored query    length, and it makes the shared-memory hash table very much smaller,    allowing more
statementsto be tracked.  (The default value of    pg_stat_statements.max is therefore increased from 1000 to 5000.)
Intypical scenarios, the hash table can be large enough to hold all the    statements commonly issued by an
application,so that there is little    "churn" in the set of tracked statements, and thus little need to do I/O    to
thefile.
 
-->        To further reduce the need for I/O to the query-texts file, add a way
-->        to retrieve all the columns of the pg_stat_statements view except for    the query text column.  This is
probablynot of much interest for human    use but it could be exploited by programs, which will prefer using the
queryidanyway.    Ordinarily, we'd need to bump the extension version number for the latter    change.  But since we
alreadyadvanced pg_stat_statements' version number    from 1.1 to 1.2 in the 9.4 development cycle, it seems all right
tojust    redefine what 1.2 means.    Peter Geoghegan, reviewed by Pavel Stehule(Tom Lane)[f0d6f2027] 2014-01-27
15:37:54-0500
 

It says "add a way" and "probably not of much interest for human use",
so it seems optional.

Thanks for the great feedback.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



pgsql-hackers by date:

Previous
From: Michael Renner
Date:
Subject: Cascading replication and archive_command
Next
From: Bruce Momjian
Date:
Subject: Re: 9.4 release notes