Re: SimpleLruTruncate() mutual exclusion - Mailing list pgsql-hackers
From: Dmitry Dolgov
Subject: Re: SimpleLruTruncate() mutual exclusion
Date:
Msg-id: 20191117115652.doqc77yu7kbrqwmk@localhost
In response to: Re: SimpleLruTruncate() mutual exclusion (Noah Misch <noah@leadboat.com>)
List: pgsql-hackers
> On Wed, Jul 31, 2019 at 11:51:17PM -0700, Noah Misch wrote:

Hi,

> > Also, if I understand the data-loss hazard properly, it's what you
> > said in the other thread: the latest_page_number could advance after
> > we make our decision about what to truncate, and then maybe we could
> > truncate newly-added data.  We surely don't want to lock out the
> > operations that can advance last_page_number, so how does serializing
> > vac_truncate_clog help?
> >
> > I wonder whether what we need to do is add some state in shared
> > memory saying "it is only safe to create pages before here", and
> > make SimpleLruZeroPage or its callers check that.  The truncation
> > logic would advance that value, but only after completing a scan.
> > (Oh ..., hmm, maybe the point of serializing truncations is to
> > ensure that we know that we can safely advance that?)
>
> vac_truncate_clog() already records "it is only safe to create pages before
> here" in ShmemVariableCache->xidWrapLimit, which it updates after any unlinks.
> The trouble comes when two vac_truncate_clog() run in parallel and you get a
> sequence of events like this:
>
>   vac_truncate_clog() instance 1 starts, considers segment ABCD eligible to unlink
>   vac_truncate_clog() instance 2 starts, considers segment ABCD eligible to unlink
>   vac_truncate_clog() instance 1 unlinks segment ABCD
>   vac_truncate_clog() instance 1 calls SetTransactionIdLimit()
>   vac_truncate_clog() instance 1 finishes
>   some backend calls SimpleLruZeroPage(), creating segment ABCD
>   vac_truncate_clog() instance 2 unlinks segment ABCD
>
> Serializing vac_truncate_clog() fixes that.

I'm probably missing something, so I just wanted to clarify. Do I understand
correctly that thread [1] and this one are independent, and that the scenario
above assumes we're at the end of the XID space, but not necessarily hitting
rounding issues? I'm a bit confused, since the reproduction script does
something with cutoffPage, and I'm not sure whether that's important or not.
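For my own understanding, the interleaving Noah describes can be modeled with a
toy simulation (plain Python, not PostgreSQL code; a dict stands in for the SLRU
segment files on disk, and every name here is just a stand-in for the function
in the trace):

```python
# Toy model of the race: each step mirrors one line of the event trace above.
segments = {"ABCD": "old, truncatable data"}  # segment name -> file contents

def unlink_segment(name):
    # like unlink(): remove the file if it exists, ignore ENOENT
    segments.pop(name, None)

def simple_lru_zero_page(name):
    # stand-in for SimpleLruZeroPage(): (re)creates the segment with live data
    segments[name] = "new, live data"

# Both truncators scan first and decide "ABCD" is eligible before either acts.
instance1_target = "ABCD"
instance2_target = "ABCD"

unlink_segment(instance1_target)   # instance 1 unlinks segment ABCD
# ... instance 1 calls SetTransactionIdLimit() and finishes ...
simple_lru_zero_page("ABCD")       # some backend recreates segment ABCD
unlink_segment(instance2_target)   # instance 2 unlinks the *recreated* segment

assert "ABCD" not in segments      # live data gone: the hazard being described
```

With the two truncations serialized, instance 2 would rescan after taking the
lock and no longer consider the recreated segment eligible, so the final unlink
would not happen.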
> > Can you post whatever script you used to detect/reproduce the problem
> > in the first place?
>
> I've attached it, but it's pretty hackish.  Apply the patch on commit 7170268,
> make sure your --bindir is in $PATH, copy "conf-test-pg" to your home
> directory, and run "make trunc-check".  This incorporates xid-burn
> acceleration code that Jeff Janes shared with -hackers at some point.

I've tried to use it to understand the problem better, but somehow couldn't
reproduce it via the suggested script. I've applied the patch to 7170268 (I
also tried it rebased onto master, with the same results) with conf-test-pg in
place, but after going through all the steps there are no errors like:

    ERROR:  could not access status of transaction 2149484247
    DETAIL:  Could not open file "pg_xact/0801": No such file or directory.

Is there anything extra one needs to do to reproduce the problem, maybe
adjust delays or something?

[1]: https://www.postgresql.org/message-id/flat/20190202083822.GC32531%40gust.leadboat.com