Re: remove spurious CREATE INDEX CONCURRENTLY wait - Mailing list pgsql-hackers

From Dmitry Dolgov
Subject Re: remove spurious CREATE INDEX CONCURRENTLY wait
Date
Msg-id 20201116182446.qcg3o6szo2zookyr@localhost
Whole thread Raw
In response to Re: remove spurious CREATE INDEX CONCURRENTLY wait  (Michael Paquier <michael@paquier.xyz>)
Responses Re: remove spurious CREATE INDEX CONCURRENTLY wait
Re: remove spurious CREATE INDEX CONCURRENTLY wait
List pgsql-hackers
> On Fri, Nov 13, 2020 at 09:25:40AM +0900, Michael Paquier wrote:
> On Thu, Nov 12, 2020 at 04:36:32PM +0100, Dmitry Dolgov wrote:
> > Interesting enough, similar discussion happened about vaccumFlags before
> > with the same conclusion that theoretically it's fine to update without
> > holding the lock, but this assumption could change one day and it's
> > better to avoid such risks. Having said that I believe it makes sense to
> > continue with locking. Are there any other opinions? I'll try to
> > benchmark it in the meantime.
>
> Thanks for planning some benchmarking for this specific patch.  I have
> to admit that the possibility of switching vacuumFlags to use atomics
> is very appealing in the long term, with or without considering this
> patch, even if we had better be sure that this patch has no actual
> effect on concurrency first if atomics are not used in worst-case
> scenarios.

I've tried first to test scenarios where GetSnapshotData produces
significant lock contention and "reindex concurrently" implementation
with locks interferes with it. The idea I had is to create a test
function that constantly calls GetSnapshotData (perf indeed shows
significant portion of time spent on contended lock), and clash it with
a stream of "reindex concurrently" of an empty relation (which still
reaches safe_index check). I guess it could be considered as an
artificial extreme case. Measuring GetSnapshotData (or rather the
surrounding wrapper, to distinguish calls from the test function from
everything else) latency without reindex, with reindex and locks, with
reindex without locks should produce different "modes" and comparing
them we can make some conclusions.

Latency histograms without reindex (nanoseconds):

     nsecs               : count     distribution
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 10001209 |****************************************|
      2048 -> 4095       : 76936    |                                        |
      4096 -> 8191       : 1468     |                                        |
      8192 -> 16383      : 98       |                                        |
     16384 -> 32767      : 39       |                                        |
     32768 -> 65535      : 6        |                                        |

The same with reindex without locks:

     nsecs               : count     distribution
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 111345   |                                        |
      2048 -> 4095       : 6997627  |****************************************|
      4096 -> 8191       : 18575    |                                        |
      8192 -> 16383      : 586      |                                        |
     16384 -> 32767      : 312      |                                        |
     32768 -> 65535      : 18       |                                        |

The same with reindex with locks:

     nsecs               : count     distribution
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 59438    |                                        |
      2048 -> 4095       : 6901187  |****************************************|
      4096 -> 8191       : 18584    |                                        |
      8192 -> 16383      : 581      |                                        |
     16384 -> 32767      : 280      |                                        |
     32768 -> 65535      : 84       |                                        |

Looks like with reindex without locks is indeed faster (there are mode
samples in lower time section), but not particularly significant to the
whole distribution, especially taking into account extremity of the
test.

I'll take a look at benchmarking of switching vacuumFlags to use
atomics, but as it's probably a bit off topic I'm going to attach
another version of the patch with locks and suggested changes. To which
I have one question:

> Michael Paquier <michael@paquier.xyz> writes:

> I think that this should be in its own routine, and that we had better
> document that this should be called just after starting a transaction,
> with an assertion enforcing that.

I'm not sure which exactly assertion condition do you mean?

Attachment

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: planner support functions: handle GROUP BY estimates ?
Next
From: Simon Riggs
Date:
Subject: Re: remove spurious CREATE INDEX CONCURRENTLY wait