Re: SimpleLruTruncate() mutual exclusion - Mailing list pgsql-hackers
From | Dmitry Dolgov |
---|---|
Subject | Re: SimpleLruTruncate() mutual exclusion |
Date | |
Msg-id | 20191122153222.jeb2jsezhso36obu@localhost |
In response to | Re: SimpleLruTruncate() mutual exclusion (Noah Misch <noah@leadboat.com>) |
List | pgsql-hackers |
> On Sun, Nov 17, 2019 at 10:14:26PM -0800, Noah Misch wrote:
>
> Though I did reproduce this bug, I'm motivated by the abstract problem more
> than any particular way to reproduce it.  Commit 996d273 inspired me; by
> removing a GetCurrentTransactionId(), it allowed the global xmin to advance at
> times it previously could not.  That subtly changed the concurrency
> possibilities.  I think safe, parallel SimpleLruTruncate() is difficult to
> maintain and helps too rarely to justify such maintenance.  That's why I
> propose eliminating the concurrency.

Sure, I see the point and the possibility for the issue itself, but of course
it's easier to reason about an issue I can reproduce :)

> I wonder about performance in a database with millions of small relations,
> particularly considering my intent to back-patch this.  In such databases,
> vac_update_datfrozenxid() can be a major part of the VACUUM's cost.  Two
> things work in our favor.  First, vac_update_datfrozenxid() runs once per
> VACUUM command, not once per relation.  Second, Autovacuum has this logic:
>
>      * ... we skip
>      * this if (1) we found no work to do and (2) we skipped at least one
>      * table due to concurrent autovacuum activity.  In that case, the other
>      * worker has already done it, or will do so when it finishes.
>      */
>     if (did_vacuum || !found_concurrent_worker)
>         vac_update_datfrozenxid();
>
> That makes me relatively unworried.
> I did consider some alternatives:

Btw, I've performed a few experiments with parallel vacuuming of 10^4 small
tables that are taking some small inserts; the results look like this:

    # with patch
    # funclatency -u bin/postgres:vac_update_datfrozenxid

         usecs               : count     distribution
             0 -> 1          : 0        |                                        |
             2 -> 3          : 0        |                                        |
             4 -> 7          : 0        |                                        |
             8 -> 15         : 0        |                                        |
            16 -> 31         : 0        |                                        |
            32 -> 63         : 0        |                                        |
            64 -> 127        : 0        |                                        |
           128 -> 255        : 0        |                                        |
           256 -> 511        : 0        |                                        |
           512 -> 1023       : 3        |***                                     |
          1024 -> 2047       : 38       |****************************************|
          2048 -> 4095       : 15       |***************                         |
          4096 -> 8191       : 15       |***************                         |
          8192 -> 16383      : 2        |**                                      |

    # without patch
    # funclatency -u bin/postgres:vac_update_datfrozenxid

         usecs               : count     distribution
             0 -> 1          : 0        |                                        |
             2 -> 3          : 0        |                                        |
             4 -> 7          : 0        |                                        |
             8 -> 15         : 0        |                                        |
            16 -> 31         : 0        |                                        |
            32 -> 63         : 0        |                                        |
            64 -> 127        : 0        |                                        |
           128 -> 255        : 0        |                                        |
           256 -> 511        : 0        |                                        |
           512 -> 1023       : 5        |****                                    |
          1024 -> 2047       : 49       |****************************************|
          2048 -> 4095       : 11       |********                                |
          4096 -> 8191       : 5        |****                                    |
          8192 -> 16383      : 1        |                                        |

In general it seems that latency tends to be a bit higher, but I don't think
it's significant.
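For anyone who wants to compare the two histograms numerically, here is a rough sketch in Python. It is not part of the original measurement: the bucket counts are copied from the funclatency output above, the helper names are made up for illustration, and the mean is only a crude estimate that assumes every call sits at the midpoint of its power-of-two bucket.

```python
# Bucket counts copied from the funclatency output above, as
# (low_usecs, high_usecs, count). Empty buckets below 512 usecs are omitted.
WITH_PATCH = [
    (512, 1023, 3), (1024, 2047, 38), (2048, 4095, 15),
    (4096, 8191, 15), (8192, 16383, 2),
]
WITHOUT_PATCH = [
    (512, 1023, 5), (1024, 2047, 49), (2048, 4095, 11),
    (4096, 8191, 5), (8192, 16383, 1),
]


def total_calls(hist):
    """Number of vac_update_datfrozenxid() calls recorded in the histogram."""
    return sum(count for _, _, count in hist)


def approx_mean_usecs(hist):
    """Crude mean: treat each call as if it took its bucket's midpoint."""
    weighted = sum((lo + hi) / 2 * count for lo, hi, count in hist)
    return weighted / total_calls(hist)


def median_bucket(hist):
    """Return the (low, high) bucket that contains the median call."""
    half = (total_calls(hist) + 1) // 2
    seen = 0
    for lo, hi, count in hist:
        seen += count
        if seen >= half:
            return (lo, hi)


def share_at_or_above(hist, threshold_usecs):
    """Fraction of calls in buckets starting at or above the threshold."""
    slow = sum(count for lo, _, count in hist if lo >= threshold_usecs)
    return slow / total_calls(hist)


for name, hist in (("with patch", WITH_PATCH), ("without patch", WITHOUT_PATCH)):
    print(name,
          "calls:", total_calls(hist),
          "median bucket:", median_bucket(hist),
          "approx mean usecs:", round(approx_mean_usecs(hist)),
          "share >= 4096 usecs:", round(share_at_or_above(hist, 4096), 3))
```

With only about 70 samples per run and power-of-two buckets, numbers like these can hint at a shift in the tail but cannot establish whether it is statistically significant.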