Home > mailing lists

Re: Improving connection scalability: GetSnapshotData() - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Improving connection scalability: GetSnapshotData()
Date	April 7, 2020 15:15:03
Msg-id	20200407121503.zltbpqmdesurflnm@alap3.anarazel.de Whole thread Raw
In response to	Re: Improving connection scalability: GetSnapshotData() (Andres Freund <andres@anarazel.de>)
Responses	Re: Improving connection scalability: GetSnapshotData() Re: Improving connection scalability: GetSnapshotData() Re: Improving connection scalability: GetSnapshotData() Re: Improving connection scalability: GetSnapshotData()
List	pgsql-hackers

Tree view

Hi,

SEE BELOW: What, and what not, to do for v13.


Attached is a substantially polished version of my patches. Note that
the first three patches, as well as the last, are not intended to be
committed at this time / in this form - they're there to make testing
easier.

There is a lot of polish, but also a few substantial changes:

- To be compatible with old_snapshot_threshold I've revised the way
  heap_page_prune_opt() deals with old_snapshot_threshold. Now
  old_snapshot_threshold is only applied when we otherwise would have
  been unable to prune (both at the time of the pd_prune_xid check, and
  on individual tuples). This makes old_snapshot_threshold considerably
  cheaper and cause less conflicts.

  This required adding a version of HeapTupleSatisfiesVacuum that
  returns the horizon, rather than doing the horizon test itself; that
  way we can first test a tuple's horizon against the normal approximate
  threshold (making it an accurate threshold if needed) and only if that
  fails fall back to old_snapshot_threshold.

  The main reason here was not to improve old_snapshot_threshold, but to
  avoid a regression when its being used. Because we need a horizon to
  pass to old_snapshot_threshold, we'd have to fall back to computing an
  accurate horizon too often.


- Previous versions of the patch had a TODO about computing horizons not
  just for one of shared / catalog / data tables, but all of them at
  once. To avoid potentially needing to determine xmin horizons multiple
  times within one transaction.  For that I've renamed GetOldestXmin()
  to ComputeTransactionHorizons() and added wrapper functions instead of
  the different flag combinations we previously had for GetOldestXmin().

  This allows us to get rid of the PROCARRAY_* flags, and PROC_RESERVED.


- To address Thomas' review comment about not accessing nextFullXid
  without xidGenLock, I made latestCompletedXid a FullTransactionId (a
  fxid is needed to be able to infer 64bit xids for the horizons -
  otherwise there is some danger they could wrap).


- Improving the comment around the snapshot caching, I decided that the
  justification for correctness around not taking ProcArrayLock is too
  complicated (in particular around setting MyProc->xmin). While
  avoiding ProcArrayLock alltogether is a substantial gain, the caching
  itself helps a lot already. Seems best to leave that for a later step.

  This means that the numbers for the very high connection counts aren't
  quite as good.


- Plenty of small changes to address issues I found while
  benchmarking. The only one of real note is that I had released
  XidGenLock after ProcArrayLock in ProcArrayAdd/Remove. For 2pc that
  causes noticable unnecessary contention, because we'll wait for
  XidGenLock while holding ProcArrayLock...


I think this is pretty close to being committable.


But: This patch came in very late for v13, and it took me much longer to
polish it up than I had hoped (partially distraction due to various bugs
I found (in particular snapshot_too_old), partially covid19, partially
"hell if I know"). The patchset touches core parts of the system. While
both Thomas and David have done some review, they haven't for the latest
version (mea culpa).

In many other instances I would say that the above suggests slipping to
v14, given the timing.

The main reason I am considering pushing is that I think this patcheset
addresses one of the most common critiques of postgres, as well as very
common, hard to fix, real-world production issues. GetSnapshotData() has
been a major bottleneck for about as long as I have been using postgres,
and this addresses that to a significant degree.

A second reason I am considering it is that, in my opinion, the changes
are not all that complicated and not even that large. At least not for a
change to a problem that we've long tried to improve.


Obviously we all have a tendency to think our own work is important, and
that we deserve a bit more leeway than others. So take the above with a
grain of salt.


Comments?


Greetings,

Andres Freund

Attachment

pgsql-hackers by date:

From: David Rowley
Date: 07 April 2020, 15:14:16
Subject: Re: Make MemoryContextMemAllocated() more precise

From: Alexander Korotkov
Date: 07 April 2020, 15:16:09
Subject: Re: [HACKERS] make async slave to wait for lsn to be replayed

Re: Improving connection scalability: GetSnapshotData() - Mailing list pgsql-hackers

Attachment

Previous

Next