== PostgreSQL Weekly News - March 1, 2020 == - Mailing list pgsql-announce

From David Fetter
Subject == PostgreSQL Weekly News - March 1, 2020 ==
Date
Msg-id 20200301233645.GA4835@fetter.org
Whole thread Raw
List pgsql-announce
== PostgreSQL Weekly News - March  1, 2020 ==

Postgres Ibiza will be held in Ibiza, Spain on June 25-26, 2020. The CfP is open until
March 8, 2020.
https://pgibz.io/

PGDay Ukraine will take place April 25th, 2020 in Lviv at the Bank Hotel.
https://pgday.org.ua/

== PostgreSQL Product News ==

pgBackRest 2.24, a backup and restore system for PostgreSQL, released.
https://pgbackrest.org/release.html#2.24

Database Lab 0.3.0, a tool for fast cloning of large PostgreSQL databases to
build non-production environments, released:
https://gitlab.com/postgres-ai/database-lab

Joe 0.5.0, a Slack chatbot that helps backend developers and DBAs troubleshoot
and optimize PostgreSQL queries, releaesd.
https://gitlab.com/postgres-ai/joe/-/releases#0.5.0

HighGo PostgreSQL Server 1.2, a PostgreSQL derivative with a focus on database
backup performance enhancement and maintaining high usability, released.
https://www.highgo.ca/products/

pgquarrel 0.7.0, a tool which compares PostgreSQL database schemas and outputs a
set of commands to turn a database schema into another one, released.
http://eulerto.github.io/pgquarrel

== PostgreSQL Jobs for March ==

http://archives.postgresql.org/pgsql-jobs/2020-03/

== PostgreSQL Local ==

pgDay Israel 2020 will take place on March 19, 2020 in Tel Aviv.
http://pgday.org.il/

pgDay Paris 2020 will be held in Paris, France on March 26, 2020 at Espace
Saint-Martin.
https://2020.pgday.paris/

Nordic PGDay 2020 will be held in Helsinki, Finland at the Hilton Helsinki
Strand Hotel on March 24, 2020.

PostgreSQL@SCaLE is a two day, two track event which takes place on
March 5-6, 2020, at Pasadena Convention Center, as part of SCaLE 18X.
https://www.socallinuxexpo.org/scale/18x/postgresscale

PGConfNepal 2020 will be held April 17-18, 2020 at Kathmandu University, Dhulikhel,
Nepal.
https://pgconf.org.np/

The German-speaking PostgreSQL Conference 2020 will take place on May 15, 2019
in Stuttgart.

PGCon 2020 will take place in Ottawa on May 26-29, 2020.
https://www.pgcon.org/2020/

PGDay.IT 2020 will take place June 11-12 in Bergamo, Italy.
https://2020.pgday.it/en/

Swiss PGDay 2020 will take place in Rapperswil (near Zurich) on June 18-19, 2020.
The Call for Speakers is open through March 17, 2020.
https://www.pgday.ch/2020/

PostgresLondon 2020 will be July 7-8, 2020 with an optional training day on
July 6. The CfP is open at https://forms.gle/5m8ybUt9YDZG4gVU7 through March
27, 2020.
http://postgreslondon.org

PG Day Russia will be in Saint Petersburg on July 10, 2020. The CfP is open at
https://pgday.ru/en/2020/for-speakers through April 6, 2020.
https://pgday.ru/en/2020/

FOSS4G 2020, will take place in Calgary, Alberta, Canada August 24-29 2020.
the Call for Papers is currently open at https://2020.foss4g.org/speakers/
https://2020.foss4g.org/

Austrian pgDay will take place September 18, 2020 at Schloss Schoenbrunn
(Apothekertrakt) in Vienna. The CfP is open until April 19, 2020 at
https://pgday.at/en/talk-commitee/
https://pgday.at/en/

== PostgreSQL in the News ==

Planet PostgreSQL: http://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to david@fetter.org.

== Applied Patches ==

Michaël Paquier pushed:

- Add prefix checks in exclude lists for pg_rewind, pg_checksums and base
  backups. An instance of PostgreSQL crashing with a bad timing could leave
  behind temporary pg_internal.init files, potentially causing failures when
  verifying checksums.  As the same exclusion lists are used between pg_rewind,
  pg_checksums and basebackup.c, all those tools are extended with prefix checks
  to keep everything in sync, with dedicated checks added for pg_internal.init.
  Backpatch down to 11, where pg_checksums (pg_verify_checksums in 11) and
  checksum verification for base backups have been introduced.  Reported-by:
  Michael Banck Author: Michael Paquier Reviewed-by: Kyotaro Horiguchi, David
  Steele Discussion:
  https://postgr.es/m/62031974fd8e941dd8351fbc8c7eff60d59c5338.camel@credativ.de
  Backpatch-through: 11
  https://git.postgresql.org/pg/commitdiff/bf883b211eae18662f2dfaede02f5d115bf0b805

- Issue properly WAL record for CID of first catalog tuple in multi-insert.
  Multi-insert for heap is not yet used actively for catalogs, but the code to
  support this case is in place for logical decoding.  The existing code forgot
  to issue a XLOG_HEAP2_NEW_CID record for the first tuple inserted, leading to
  failures when attempting to use multiple inserts for catalogs at decoding
  time.  This commit fixes the problem by WAL-logging the needed CID.  This is
  not an active bug, so no back-patch is done.  Author: Daniel Gustafsson
  Discussion: https://postgr.es/m/E0D4CC67-A1CF-4DF4-991D-B3AC2EB5FAE9@yesql.se
  https://git.postgresql.org/pg/commitdiff/7d672b76bf27327dc3527dabcd8be4e2dedf430f

- Fix build failure on header generation with repetitive builds of MSVC.
  GenerateConfigHeader() in Solution.pm was complaining about unused define
  symbols even if a newer config header was not generated, causing successive
  build attempts with MSVC to fail.  Oversight in commit 8f4fb4c.  Author:
  Kyotaro Horiguchi Reviewed-by: Juan José Santamaría Flecha Discussion:
  https://postgr.es/m/20200218.160500.44393633318853097.horikyota.ntt@gmail.com
  https://git.postgresql.org/pg/commitdiff/59f9cd9dd5e4db8c59c57a17388c17564a3211a3

- createdb: Fix quoting of --encoding, --lc-ctype and --lc-collate. The original
  coding failed to properly quote those arguments, leading to failures when
  using quotes in the values used.  As the quoting can be encoding-sensitive,
  the connection to the backend needs to be taken before applying the correct
  quoting.  Author: Michael Paquier Reviewed-by: Daniel Gustafsson Discussion:
  https://postgr.es/m/20200214041004.GB1998@paquier.xyz Backpatch-through: 9.5
  https://git.postgresql.org/pg/commitdiff/008cf040962c98c7c55d54c28dcb43c3c1d83c92

- Skip foreign tablespaces when running pg_checksums/pg_verify_checksums.
  Attempting to use pg_checksums (pg_verify_checksums in 11) on a data folder
  which includes tablespace paths used across multiple major versions would
  cause pg_checksums to scan all directories present in pg_tblspc, and not only
  marked with TABLESPACE_VERSION_DIRECTORY.  This could lead to failures when
  for example running sanity checks on an upgraded instance with --check.  Even
  worse, it was possible to rewrite on-disk pages with --enable for a cluster
  potentially online.  This commit makes pg_checksums skip any directories not
  named TABLESPACE_VERSION_DIRECTORY, similarly to what is done for base
  backups.  Reported-by: Michael Banck Author: Michael Banck, Bernd Helmle
  Discussion:
  https://postgr.es/m/62031974fd8e941dd8351fbc8c7eff60d59c5338.camel@credativ.de
  backpatch-through: 11
  https://git.postgresql.org/pg/commitdiff/428a2609ef64b69d709418a50e192ff11a2643f1

- Remove TAP test for createdb --lc-ctype. OpenBSD falls back to "C" when using
  an incorrect input with setlocale() and LC_CTYPE, causing this test,
  introduced by 008cf04, to fail.  This removes the culprit test to avoid the
  portability issue.  Per report from Robert Haas, via buildfarm member
  curculio.  Discussion:
  https://postgr.es/m/CA+TgmoZ6ddh3mHD9gU8DvNYoFmuJaYYn1+4AvZNp25vTdRwCAQ@mail.gmail.com
  Backpatch-through: 11
  https://git.postgresql.org/pg/commitdiff/c4b0edb07ed53063ea4c86cd7918ad6ea01d8979

Robert Haas pushed:

- Move bitmap_hash and bitmap_match to bitmapset.c. The closely-related function
  bms_hash_value is already defined in that file, and this change means that
  hashfn.c no longer needs to depend on nodes/bitmapset.h. That gets us closer
  to allowing use of the hash functions in hashfn.c in frontend code.  Patch by
  me, reviewed by Suraj Kharage and Mark Dilger.  Discussion:
  http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
  https://git.postgresql.org/pg/commitdiff/07b95c3d8334f737d4717c91967729f7721e785c

- Put all the prototypes for hashfn.c into the same header file. Previously,
  some of the prototypes for functions in hashfn.c were in utils/hashutils.h and
  others were in utils/hsearch.h, but that is confusing and has no particular
  benefit.  Patch by me, reviewed by Suraj Kharage and Mark Dilger.  Discussion:
  http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
  https://git.postgresql.org/pg/commitdiff/9341c783cc42ffae5860c86bdc713bd47d734ffd

- Adapt hashfn.c and hashutils.h for frontend use. hash_any() and its various
  variants are defined to return Datum, which is a backend-only concept, but the
  underlying functions actually want to return uint32 and uint64, and only
  return Datum because it's convenient for callers who are using them to
  implement a hash function for some SQL datatype.  However, changing these
  functions to return uint32 and uint64 seems like it might lead to programming
  errors or back-patching difficulties, both because they are widely used and
  because failure to use UInt{32,64}GetDatum() might not provoke a compilation
  error. Instead, rename the existing functions as well as changing the return
  type, and add static inline wrappers for those callers that need the previous
  behavior.  Although this commit adapts hashutils.h and hashfn.c so that they
  can be compiled as frontend code, it does not actually do anything that would
  cause them to be so compiled. That is left for another commit.  Patch by me,
  reviewed by Suraj Kharage and Mark Dilger.  Discussion:
  http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
  https://git.postgresql.org/pg/commitdiff/a91e2fa94180f24dd68fb6c99136cda820e02089

- Move src/backend/utils/hash/hashfn.c to src/common. This also involves
  renaming src/include/utils/hashutils.h, which becomes
  src/include/common/hashfn.h. Perhaps an argument can be made for keeping the
  hashutils.h name, but it seemed more consistent to make it match the name of
  the file, and also more descriptive of what is actually going on here.  Patch
  by me, reviewed by Suraj Kharage and Mark Dilger. Off-list advice on how not
  to break the Windows build from Davinder Singh and Amit Kapila.  Discussion:
  http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
  https://git.postgresql.org/pg/commitdiff/05d8449e73694585b59f8b03aaa087f04cc4679a

Peter Eisentraut pushed:

- Change client-side fsync_fname() to report errors fatally. Given all we have
  learned about fsync() error handling in the last few years, reporting an
  fsync() error non-fatally is not useful, unless you don't care much about the
  file, in which case you probably don't need to use fsync() in the first place.
  Change fsync_fname() and durable_rename() to exit(1) on fsync() errors other
  than those that we specifically chose to ignore.  This affects initdb,
  pg_basebackup, pg_checksums, pg_dump, pg_dumpall, and pg_rewind.  Reviewed-by:
  Michael Paquier <michael@paquier.xyz> Discussion:
  https://www.postgresql.org/message-id/flat/d239d1bd-aef0-ca7c-dc0a-da14bdcf0392%402ndquadrant.com
  https://git.postgresql.org/pg/commitdiff/1420617b14e2e3722367b826986a50ea33ff62ec

- Add PostgreSQL home page to --help output. Per emerging standard in GNU
  programs and elsewhere.  Autoconf already has support for specifying a home
  page, so we can just that.  Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
  Discussion:
  https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
  https://git.postgresql.org/pg/commitdiff/1933ae629e7b706c6c23673a381e778819db307d

- Refer to bug report address by symbol rather than hardcoding. Use the
  PACKAGE_BUGREPORT macro that is created by Autoconf for referring to the bug
  reporting address rather than hardcoding it everywhere.  This makes it easier
  to change the address and it reduces translation work.  Reviewed-by: Daniel
  Gustafsson <daniel@yesql.se> Discussion:
  https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
  https://git.postgresql.org/pg/commitdiff/864934131ef72dc3a403ad1375a94543fcc04206

Tom Lane pushed:

- Account explicitly for long-lived FDs that are allocated outside fd.c. The
  comments in fd.c have long claimed that all file allocations should go through
  that module, but in reality that's not always practical. fd.c doesn't supply
  APIs for invoking some FD-producing syscalls like pipe() or epoll_create();
  and the APIs it does supply for non-virtual FDs are mostly insistent on
  releasing those FDs at transaction end; and in some cases the actual open()
  call is in code that can't be made to use fd.c, such as libpq.  This has led
  to a situation where, in a modern server, there are likely to be seven or so
  long-lived FDs per backend process that are not known to fd.c.  Since
  NUM_RESERVED_FDS is only 10, that meant we had *very* few spare FDs if
  max_files_per_process is >= the system ulimit and fd.c had opened all the
  files it thought it safely could.  The contrib/postgres_fdw regression test,
  in particular, could easily be made to fall over by running it under a
  restrictive ulimit.  To improve matters, invent functions
  Acquire/Reserve/ReleaseExternalFD that allow outside callers to tell fd.c that
  they have or want to allocate a FD that's not directly managed by fd.c.  Add
  calls to track all the fixed FDs in a standard backend session, so that we are
  honestly guaranteeing that NUM_RESERVED_FDS FDs remain unused below the EMFILE
  limit in a backend's idle state.  The coding rules for these functions say
  that there's no need to call them in code that just allocates one FD over a
  fairly short interval; we can dip into NUM_RESERVED_FDS for such cases. That
  means that there aren't all that many places where we need to worry. But
  postgres_fdw and dblink must use this facility to account for long-lived FDs
  consumed by libpq connections.  There may be other places where it's worth
  doing such accounting, too, but this seems like enough to solve the immediate
  problem.  Internally to fd.c, "external" FDs are limited to max_safe_fds/3
  FDs. (Callers can choose to ignore this limit, but of course it's unwise to do
  so except for fixed file allocations.)  I also reduced the limit on
  "allocated" files to max_safe_fds/3 FDs (it had been max_safe_fds/2).
  Conceivably a smarter rule could be used here --- but in practice, on
  reasonable systems, max_safe_fds should be large enough that this isn't much
  of an issue, so KISS for now.  To avoid possible regression in the number of
  external or allocated files that can be opened, increase FD_MINFREE and the
  lower limit on max_files_per_process a little bit; we now insist that the
  effective "ulimit -n" be at least 64.  This seems like pretty clearly a bug
  fix, but in view of the lack of field complaints, I'll refrain from risking a
  back-patch.  Discussion:
  https://postgr.es/m/E1izCmM-0005pV-Co@gemulon.postgresql.org
  https://git.postgresql.org/pg/commitdiff/3d475515a15f70a4a3f36fbbba93db6877ff8346

- Fix compile failure. I forgot that some compilers won't handle #if constructs
  within ereport() calls.  Duplicating most of the call is annoying but simple.
  Per buildfarm.
  https://git.postgresql.org/pg/commitdiff/36390713a60f446da7e7ae758771c9104fa89394

- Suppress unnecessary RelabelType nodes in more cases. eval_const_expressions
  sometimes produced RelabelType nodes that were useless because they just
  relabeled an expression to the same exposed type it already had.  This is
  worth avoiding because it can cause two equivalent expressions to not be
  equal(), preventing recognition of useful optimizations.  In the test case
  added here, an unpatched planner fails to notice that the "sqli = constant"
  clause renders a sort step unnecessary, because one code path produces an
  extra RelabelType and another doesn't.  Fix by ensuring that
  eval_const_expressions_mutator's T_RelabelType case will not add in an
  unnecessary RelabelType.  Also save some code by sharing a subroutine with the
  effectively-equivalent cases for CollateExpr and CoerceToDomain.  (CollateExpr
  had no bug, and I think that the case couldn't arise with CoerceToDomain, but
  it seems prudent to do the same check for all three cases.)  Back-patch to
  v12.  In principle this has been wrong all along, but I haven't seen a case
  where it causes visible misbehavior before v12, so refrain from changing
  stable branches unnecessarily.  Per investigation of a report from Eric
  Gillum.  Discussion:
  https://postgr.es/m/CAMmjdmvAZsUEskHYj=KT9sTukVVCiCSoe_PBKOXsncFeAUDPCQ@mail.gmail.com
  https://git.postgresql.org/pg/commitdiff/a477bfc1dfb8d2b7125a9818edcbf383bf82d62a

- Doc: correct thinko in pg_buffercache documentation. Access to this module is
  granted to the pg_monitor role, not pg_read_all_stats.  (Given the view's
  performance impact, it seems wise to be restrictive, so I think this was the
  correct decision --- and anyway it was clearly intentional.)  Per bug #16279
  from Philip Semanchuk.  Discussion:
  https://postgr.es/m/16279-fcaac33c68aab0ab@postgresql.org
  https://git.postgresql.org/pg/commitdiff/963ea55c34cc325243fd30468ce8c629572f0c73

- Avoid failure if autovacuum tries to access a just-dropped temp namespace.
  Such an access became possible when commit 246a6c8f7 added more aggressive
  cleanup of orphaned temp relations by autovacuum. Since autovacuum's snapshot
  might be slightly stale, it could attempt to access an already-dropped temp
  namespace, resulting in an assertion failure or null-pointer dereference.  (In
  practice, since we don't drop temp namespaces automatically but merely recycle
  them, this situation could only arise if a superuser does a manual drop of a
  temp namespace.  Still, that should be allowed.)  The core of the bug, IMO, is
  that isTempNamespaceInUse and its callers failed to think hard about whether
  to treat "temp namespace isn't there" differently from "temp namespace isn't
  in use".  In hopes of forestalling future mistakes of the same ilk, replace
  that function with a new one checkTempNamespaceStatus, which makes the same
  tests but returns a three-way enum rather than just a bool.
  isTempNamespaceInUse is gone entirely in HEAD; but just in case some external
  code is relying on it, keep it in the back branches, as a bug-compatible
  wrapper around the new function.  Per report originally from Prabhat Kumar
  Sahu, investigated by Mahendra Singh and Michael Paquier; the final form of
  the patch is my fault. This replaces the failed fix attempt in a052f6cbb.
  Backpatch as far as v11, as 246a6c8f7 was.  Discussion:
  https://postgr.es/m/CAKYtNAr9Zq=1-ww4etHo-VCC-k120YxZy5OS01VkaLPaDbv2tg@mail.gmail.com
  https://git.postgresql.org/pg/commitdiff/80d76be51cfb7f8f54e779ec5a287d80ac28b6e4

- Remove obsolete comment. Noted while studying subplan hash issue.
  https://git.postgresql.org/pg/commitdiff/6afc8aefd3b95c0c4e7d07d2c99b90ce83e313de

- Correctly re-use hash tables in buildSubPlanHash(). Commit 356687bd8 omitted
  to remove leftover code for destroying a hashed subplan's hash tables, with
  the result that the tables were always rebuilt not reused; this leads to
  severe memory leakage if a hashed subplan is re-executed enough times.
  Moreover, the code for reusing the hashnulls table had a typo that would have
  made it do the wrong thing if it were reached.  Looking at the code coverage
  report shows severe under-coverage of the potential callers of
  ResetTupleHashTable, so add some test cases that exercise them.  Andreas
  Karlsson and Tom Lane, per reports from Ranier Vilela and Justin Pryzby.
  Backpatch to v11, as the faulty commit was.  Discussion:
  https://postgr.es/m/edb62547-c453-c35b-3ed6-a069e4d6b937@proxel.se Discussion:
  https://postgr.es/m/CAEudQAo=DCebm1RXtig9OH+QivpS97sMkikt0A9qHmMUs+g6ZA@mail.gmail.com
  Discussion: https://postgr.es/m/20200210032547.GA1412@telsasoft.com
  https://git.postgresql.org/pg/commitdiff/58c47ccfff20b8c125903482725c1dbfd30beade

Andres Freund pushed:

- expression eval: Reduce number of steps for agg transition invocations. Do so
  by combining the various steps that are part of aggregate transition function
  invocation into one larger step. As some of the current steps are only
  necessary for some aggregates, have one variant of the aggregate transition
  step for each possible combination.  To avoid further manual copies of code in
  the different transition step implementations, move most of the code into
  helper functions marked as "always inline".  The benefit of this change is an
  increase in performance when aggregating lots of rows. This comes in part due
  to the reduced number of indirect jumps due to the reduced number of steps,
  and in part by reducing redundant setup code across steps. This mainly
  benefits interpreted execution, but the code generated by JIT is also improved
  a bit.  As a nice side-effect it also ends up making the code a bit simpler.
  A small additional optimization is removing the need to set
  aggstate->curaggcontext before calling ExecAggInitGroup, choosing to instead
  passign curaggcontext as an argument. It was, in contrast to other aggregate
  related functions, only needed to fetch a memory context to copy the
  transition value into.  Author: Andres Freund Discussion:
  https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
  https://postgr.es/m/5c371df7cee903e8cd4c685f90c6c72086d3a2dc.camel@j-davis.com
  https://git.postgresql.org/pg/commitdiff/2742c45080077ed3b08b810bb96341499b86d530

Magnus Hagander pushed:

- Include error code in message from pg_upgrade. In passing, also quote the
  filename in one message where it wasn't.  Author: Dagfinn Ilmari Mannsåker
  Discussion: https://postgr.es/m/87pne2w98h.fsf@wibble.ilmari.org
  https://git.postgresql.org/pg/commitdiff/4109bb5de4998b9301ea2ac18c9d6dfb0b4f900b

Peter Geoghegan pushed:

- Add equalimage B-Tree support functions. Invent the concept of a B-Tree
  equalimage ("equality implies image equality") support function, registered as
  support function 4.  This indicates whether it is safe (or not safe) to apply
  optimizations that assume that any two datums considered equal by an operator
  class's order method must be interchangeable without any loss of semantic
  information. This is static information about an operator class and a
  collation.  Register an equalimage routine for almost all of the existing
  B-Tree opclasses.  We only need two trivial routines for all of the opclasses
  that are included with the core distribution.  There is one routine for
  opclasses that index non-collatable types (which returns 'true'
  unconditionally), plus another routine for collatable types (which returns
  'true' when the collation is a deterministic collation).  This patch is
  infrastructure for an upcoming patch that adds B-Tree deduplication.  Author:
  Peter Geoghegan, Anastasia Lubennikova Discussion:
  https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com
  https://git.postgresql.org/pg/commitdiff/612a1ab76724aa1514b6509269342649f8cab375

- Add deduplication to nbtree. Deduplication reduces the storage overhead of
  duplicates in indexes that use the standard nbtree index access method.  The
  deduplication process is applied lazily, after the point where opportunistic
  deletion of LP_DEAD-marked index tuples occurs.  Deduplication is only applied
  at the point where a leaf page split would otherwise be required.  New posting
  list tuples are formed by merging together existing duplicate tuples.  The
  physical representation of the items on an nbtree leaf page is made more space
  efficient by deduplication, but the logical contents of the page are not
  changed.  Even unique indexes make use of deduplication as a way of
  controlling bloat from duplicates whose TIDs point to different versions of
  the same logical table row.  The lazy approach taken by nbtree has significant
  advantages over a GIN style eager approach.  Most individual inserts of index
  tuples have exactly the same overhead as before.  The extra overhead of
  deduplication is amortized across insertions, just like the overhead of page
  splits.  The key space of indexes works in the same way as it has since commit
  dd299df8 (the commit that made heap TID a tiebreaker column).  Testing has
  shown that nbtree deduplication can generally make indexes with about 10 or 15
  tuples for each distinct key value about 2.5X - 4X smaller, even with single
  column integer indexes (e.g., an index on a referencing column that
  accompanies a foreign key).  The final size of single column nbtree indexes
  comes close to the final size of a similar contrib/btree_gin index, at least
  in cases where GIN's posting list compression isn't very effective.  This can
  significantly improve transaction throughput, and significantly reduce the
  cost of vacuuming indexes.  A new index storage parameter (deduplicate_items)
  controls the use of deduplication.  The default setting is 'on', so all new
  B-Tree indexes automatically use deduplication where possible.  This decision
  will be reviewed at the end of the Postgres 13 beta period.  There is a
  regression of approximately 2% of transaction throughput with synthetic
  workloads that consist of append-only inserts into a table with several
  non-unique indexes, where all indexes have few or no repeated values.  The
  underlying issue is that cycles are wasted on unsuccessful attempts at
  deduplicating items in non-unique indexes. There doesn't seem to be a way
  around it short of disabling deduplication entirely.  Note that deduplication
  of items in unique indexes is fairly well targeted in general, which avoids
  the problem there (we can use a special heuristic to trigger deduplication
  passes in unique indexes, since we're specifically targeting "version bloat").
  Bump XLOG_PAGE_MAGIC because xl_btree_vacuum changed.  No bump in
  BTREE_VERSION, since the representation of posting list tuples works in a way
  that's backwards compatible with version 4 indexes (i.e. indexes built on
  PostgreSQL 12).  However, users must still REINDEX a pg_upgrade'd index to use
  deduplication, regardless of the Postgres version they've upgraded from.  This
  is the only way to set the new nbtree metapage flag indicating that
  deduplication is generally safe.  Author: Anastasia Lubennikova, Peter
  Geoghegan Reviewed-By: Peter Geoghegan, Heikki Linnakangas Discussion:
  https://postgr.es/m/55E4051B.7020209@postgrespro.ru
  https://postgr.es/m/4ab6e2db-bcee-f4cf-0916-3a06e6ccbb55@postgrespro.ru
  https://git.postgresql.org/pg/commitdiff/0d861bbb702f8aa05c2a4e3f1650e7e8df8c8c27

- Silence compiler warning in nbtinsert.c. Per buildfarm member longfin.
  https://git.postgresql.org/pg/commitdiff/2d8a6fad18fcec1850b79ddcf3a0032058d86715

- Silence another compiler warning in nbtinsert.c. Per complaint from Álvaro
  Herrera.
  https://git.postgresql.org/pg/commitdiff/2c0797da2c26dd5574e0d9128887c1fc2a385c25

- Doc: Fix deduplicate_items index term. Reported-By: Fujii Masao Discussion:
  https://postgr.es/m/18f07ae8-7d89-537c-b0a9-54100a1b46da@oss.nttdata.com
  https://git.postgresql.org/pg/commitdiff/e537aed61db767745b614600be15cd19bb581330

- Teach pageinspect about nbtree deduplication. Add a new bt_metap() column to
  display the metapage's allequalimage field.  Also add three new columns to
  contrib/pageinspect's bt_page_items() function:  * Add a boolean column
  ("dead") that displays the LP_DEAD bit value for each non-pivot tuple.  * Add
  a TID column ("htid") that displays a single heap TID value for each tuple.
  This is the TID that is returned by BTreeTupleGetHeapTID(), so comparable
  values are shown for pivot tuples, plain non-pivot tuples, and posting list
  tuples.  * Add a TID array column ("tids") that displays TIDs from each
  tuple's posting list, if any.  This works just like the "tids" column from
  pageinspect's gin_leafpage_items() function.  No version bump for the
  pageinspect extension, since there hasn't been a stable Postgres release since
  the last version bump (the last bump was part of commit 58b4cb30).  Author:
  Peter Geoghegan Discussion:
  https://postgr.es/m/CAH2-WzmSMmU2eNvY9+a4MNP+z02h6sa-uxZvN3un6jY02ZVBSw@mail.gmail.com
  https://git.postgresql.org/pg/commitdiff/93ee38eade1b2b4964354b95b01b09e17d6f098d

- Doc: Fix pageinspect bt_page_items() example. Oversight in commit 93ee38ea.
  https://git.postgresql.org/pg/commitdiff/dba915333967609cc7ca73f0904dfdc08db0a052

Álvaro Herrera pushed:

- Fix docs regarding AFTER triggers on partitioned tables. In commit
  86f575948c77 I forgot to update the trigger.sgml paragraph that needs to
  explain that AFTER triggers are allowed in partitioned tables.  Do so now.
  Discussion: https://postgr.es/m/20200224185850.GA30899@alvherre.pgsql
  https://git.postgresql.org/pg/commitdiff/3acfe6b089a5283c2961eb2ff135106e23b89263

- Record parents of triggers. This let us get rid of a recently introduced ugly
  hack (commit 1fa846f1c9af).  Author: Álvaro Herrera Reviewed-by: Amit Langote,
  Tom Lane Discussion: https://postgr.es/m/20200217215641.GA29784@alvherre.pgsql
  https://git.postgresql.org/pg/commitdiff/b9b408c487244ef8e6d613d183c56eb2c62990b1

- Catversion bump for b9b408c48724. Per Tom Lane.
  https://git.postgresql.org/pg/commitdiff/afb5465e0cfce7637066eaaaeecab30b0f23fbe3

- Add comments on avoid reuse of parse-time snapshot. Apparently, reusing the
  parse-time query snapshot for later steps (execution) is a frequently
  considered optimization ... but it doesn't work, for reasons discovered in
  thread [1].  Adding some comments about why it doesn't really work can relieve
  some future hackers from wasting time reimplementing it again.  [1]
  https://postgr.es/m/flat/5075D8DF.6050500@fuzzy.cz  Author: Michail Nikolaev
  Discussion:
  https://postgr.es/m/CANtu0ogp6cTvMJObXP8n=k+JtqxY1iT9UV5MbGCpjjPa5crCiw@mail.gmail.com
  https://git.postgresql.org/pg/commitdiff/db989184cda7f4aa1ff764cca96142029e7e093b

Jeff Davis pushed:

- Save calculated transitionSpace in Agg node. This will be useful in the
  upcoming Hash Aggregation work to improve estimates for hash table sizing.
  Discussion:
  https://postgr.es/m/37091115219dd522fd9ed67333ee8ed1b7e09443.camel%40j-davis.com
  https://git.postgresql.org/pg/commitdiff/c11cb17dc55a6b432dc637491a4491920f9c2770

- Fix commit c11cb17d. I neglected to update copyfuncs/outfuncs/readfuncs.
  Discussion: https://postgr.es/m/12491.1582833409%40sss.pgh.pa.us
  https://git.postgresql.org/pg/commitdiff/32bb4535a0be892f4a2794a26ba429032f4ae51e

Dean Rasheed pushed:

- Fix corner-case loss of precision in numeric ln(). When deciding on the local
  rscale to use for the Taylor series expansion, ln_var() neglected to account
  for the fact that the result is subsequently multiplied by a factor of
  2^(nsqrt+1), where nsqrt is the number of square root operations performed in
  the range reduction step, which can be as high as 22 for very large inputs.
  This could result in a loss of precision, particularly when combined with
  large rscale values, for which a large number of Taylor series terms is
  required (up to around 400).  Fix by computing a few extra digits in the
  Taylor series, based on the weight of the multiplicative factor
  log10(2^(nsqrt+1)). It remains to be proven whether or not the other 8 extra
  digits used for the Taylor series is appropriate, but this at least deals with
  the obvious oversight of failing to account for the effects of the final
  multiplication.  Per report from Justin AnyhowStep. Reviewed by Tom Lane.
  Discussion: https://postgr.es/m/16280-279f299d9c06e56f@postgresql.org
  https://git.postgresql.org/pg/commitdiff/43a899f41f46918a0bf442edb091b08c214c68f8

== Pending Patches ==

Dagfinn Ilmari Mannsåker sent in another revision of a patch to fix parallel
make of libpq.

Peter Eisentraut sent in a patch to remove win32ver.rc from version_stamp.pl.

Peter Eisentraut sent in a patch to make the bitmap of updated columns available
to a trigger in TriggerData.

Andy Fan sent in another revision of a patch to erase the distinctClause if the
result is unique by definition.

Tom Lane sent in another revision of a patch to account honestly for external FD
usage.

Alexander Korotkov sent in another revision of a patch to improve checking for
missing parent links.

Tom Lane sent in another revision of a patch to implement unicode escapes with
any backend encoding.

Peter Eisentraut sent in another revision of a patch to propagate ALTER TABLE
... SET STORAGE to indexes.

Juan José Santamaría Flecha sent in another revision of a patch to fix command
line colorization on windows.

Thomas Munro sent in a patch to use FeBeWaitSet for GSSAPI and SSL waits.

Antonin Houska sent in a patch to move two variables in RelationCopyStorage() to
a narrower scope.

Antonin Houska sent in a patch to change an Assertion form "not persistent" to
"is ephemeral".

Masahiko Sawada and Cary Huang traded patches to implement an internal key
management system.

Rémi Lapeyre sent in another revision of a patch to enable COPY "test" to output
a header and add header matching mode to COPY FROM.

Zeng Wenjing sent in four more revisions of a patch to implement global
temporary tables.

David Steele sent in a patch to skip pg_internal.init.

Konstantin Knizhnik sent in a patch to port vectorize_engine to zedstore.

Michael Banck sent in a patch to fix checksum verification in base backups for
random or zero page headers.

Tom Lane and Andrew Dunstan traded patches to help transition PL/PythonU from 2
to 3.

Imai Yoshikazu sent in another revision of a patch to add a pg_stat_waitaccum
view and change the measuring method of wait event time from INSTR_TIME to
rdtsc.

Daniel Gustafsson sent in two more revisions of a patch to use
heap_multi_insert() for pg_attribute/depend insertions.

Alexander Korotkov sent in six more revisions of a patch to pg_rewind to add
options to restore WAL files from archives.

Justin Pryzby sent in two more revisions of a patch to make HashAggregate report
bucket and memory stats.

Fujii Masao sent in another revision of a patch to see to it that Wait event is
reported while waiting for WAL archiving to finish.

Fujii Masao sent in another revision of a patch to implement a
pg_stat_progress_basebackup view.

Dagfinn Ilmari Mannsåker sent in three revisions of a patch to make pg_upgrade's
version file parsing error message consistent.

Julien Rouhaud sent in two more revisions of a patch to implement collation
versioning.

David Cramer sent in three more revisions of a patch to throw an error on failed
COMMIT.

Yugo Nagata sent in a patch to the auto_explain contrib extension which makes it
possible to log plans before the queries are executed.

Yuzuko Hosoya sent in another revision of a patch to fix an infelicity between
autovacuum and partitioned tables.

Masahiko Sawada sent in another revision of a patch to identify user-created
objects.

John Naylor and David Fetter traded patches to truncate timestamps on arbitrary
intervals.

David Fetter sent in two more revisions of a patch to use compiler intrinsics
for bit operations in hashing.

Masahiko Sawada sent in two more revisions of a patch to improve wait events of
recovery conflict resolution.

Pavel Stěhule sent in three more revisions of a patch to implement schema
variables.

Thomas Munro sent in another revision of a patch to prevent WaitEventSet syscall
churn.

Jeff Davis sent in another revision of a patch to implement memory-bounded hash
aggregation.

Robert Haas sent in another revision of a patch to implement backup manifests.

Nikita Glukhov sent in another revision of a patch to implement JSONPATH syntax
extensions.

Álvaro Herrera sent in a patch to implement BEFORE ROW triggers for partitioned
tables.

Jeff Janes sent in a patch to psql to enable tab completion for ALTER TEXT
SEARCH DICTIONARY.

Jeff Janes sent in a patch to add an absolute value option to the dict_int
function.

Justin Pryzby sent in another revision of a patch to include the block number in
the error context for vacuum.

Kyotaro HORIGUCHI sent in two revisions of a patch to add a TAP test for a crash
bug, and fix said crash bug in targeted promotion.

Takuma Hoshiai sent in another revision of a patch to implement incremental
maintenance of materialized views.

Jeff Davis sent in a patch to add a function that makes it possible to add new
tapes dynamically.

Peter Eisentraut sent in a patch to improve handling of parameter differences in
physical replication.

Tom Lane sent in two revisions of a patch to ensure less-silly selectivity for
JSONB matching operators.

Tom Lane sent in another revision of a patch to revisit opclass member checks
and dependency strength.

Kyotaro HORIGUCHI sent in another revision of a patch to allow wait event sets
to be registered to resource owners, add some infrastructure for asynchronous
execution, and use same for the PostgreSQL FDW.

Justin Pryzby sent in another revision of a patch to emit a more specific error
message when failing to alter a partitioned index, and allow reloptions on
partitioned tables and indexes.

Kyotaro HORIGUCHI sent in a patch to make end-of-recovery errors less scary.

Masahiko Sawada sent in another revision of a patch to add a RESUME option to
VACUUM and autovacuum.

Richard Guo sent in a PoC patch to pull up EXPR_SUBLINK.

Peter Eisentraut sent in a patch to remove HAVE_WORKING_LINK.

Juan José Santamaría Flecha sent in three more revisions of a patch to support
large files on Win32.

Álvaro Herrera sent in another revision of a patch to migrate commandTag from
string to enum.

Tomáš Vondra sent in two revisions of a patch to enable ALTER TYPE to change
storage strategy.

Noah Misch sent in another revision of a patch to fix the WAL-skipping feature.

Nikita Glukhov sent in a patch to introduce opclass parameters and use same.

Dilip Kumar sent in another revision of a patch to remove an infelicity between
logical_work_mem and logical streaming of large in-progress transactions.

David Zhang sent in another revision of a patch to fix psql to report IO errors
on writing query results.

Aleksey Kondratov and Justin Pryzby traded patches to Allow CLUSTER, VACUUM FULL
and REINDEX to change tablespace on the fly.

David Cramer sent in another revision of a patch to implement binary support for
the pgoutput plugin.

Justin Pryzby sent in another revision of a patch to fix the fact that ALTER tbl
rewrite loses CLUSTER ON index.

Tomáš Vondra sent in another revision of a patch to implement SLRU statistics.

Andrey Borodin sent in two more revisions of a patch to add sort support for
point gist_point_sortsupport, implement GiST build using sort support, and add a
function relopt for gist build.

Chis Bandy sent in a patch to add schema and table names to partition error.

Andreas Karlsson sent in another revision of a patch to fix the process of
resetting of subplan hash tables.

Fabien COELHO sent in another revision of a patch to add some barriers to
pgbench.

Ivan Panchenko sent in two revisions of a patch to make a transform which
transforms bool for PL/Perl[U].

Tom Lane sent in two more revisions of a patch to allow localized month names in
to_date and to_timestamp.

KaiGai Kohei sent in another revision of a patch to enable TRUNCATE on foreign
tables.

Mike Palmiotto sent in another revision of a patch to coalesce all of the
postmaster subprocess startups into one central framework, and use same to add a
hook to allow extensions to set worker metadata.

Andres Freund sent in a patch to make connections scale better by looking at
PGXACT->xmin a lot less frequently.

Amit Langote sent in another revision of a patch to fix partitionwise join to
handle FULL JOINs correctly.

Pavel Stěhule sent in a patch to add a reglanguage type.

Floris Van Nee sent in another revision of a patch to avoid a pipeline stall in
_bt_compare(), and inline _bt_compare().




pgsql-announce by date:

Previous
From: David Christensen
Date:
Subject: Bucardo 5.6.0 released!
Next
From: "Andrey M. Borodin"
Date:
Subject: Odyssey 1.1