Re: SimpleLruTruncate() mutual exclusion - Mailing list pgsql-hackers

From Dmitry Dolgov
Subject Re: SimpleLruTruncate() mutual exclusion
Date
Msg-id 20191117115652.doqc77yu7kbrqwmk@localhost
Whole thread Raw
In response to Re: SimpleLruTruncate() mutual exclusion  (Noah Misch <noah@leadboat.com>)
Responses Re: SimpleLruTruncate() mutual exclusion  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
> On Wed, Jul 31, 2019 at 11:51:17PM -0700, Noah Misch wrote:

Hi,

> > Also, if I understand the data-loss hazard properly, it's what you
> > said in the other thread: the latest_page_number could advance after
> > we make our decision about what to truncate, and then maybe we could
> > truncate newly-added data.  We surely don't want to lock out the
> > operations that can advance last_page_number, so how does serializing
> > vac_truncate_clog help?
> >
> > I wonder whether what we need to do is add some state in shared
> > memory saying "it is only safe to create pages before here", and
> > make SimpleLruZeroPage or its callers check that.  The truncation
> > logic would advance that value, but only after completing a scan.
> > (Oh ..., hmm, maybe the point of serializing truncations is to
> > ensure that we know that we can safely advance that?)
>
> vac_truncate_clog() already records "it is only safe to create pages before
> here" in ShmemVariableCache->xidWrapLimit, which it updates after any unlinks.
> The trouble comes when two vac_truncate_clog() run in parallel and you get a
> sequence of events like this:
>
> vac_truncate_clog() instance 1 starts, considers segment ABCD eligible to unlink
> vac_truncate_clog() instance 2 starts, considers segment ABCD eligible to unlink
> vac_truncate_clog() instance 1 unlinks segment ABCD
> vac_truncate_clog() instance 1 calls SetTransactionIdLimit()
> vac_truncate_clog() instance 1 finishes
> some backend calls SimpleLruZeroPage(), creating segment ABCD
> vac_truncate_clog() instance 2 unlinks segment ABCD
>
> Serializing vac_truncate_clog() fixes that.

I'm probably missing something, so just wanted to clarify. Do I
understand correctly, that thread [1] and this one are independent, and
it is assumed in the scenario above that we're at the end of XID space,
but not necessarily having rounding issues? I'm a bit confused, since
the reproduce script does something about cutoffPage, and I'm not sure
if it's important or not.

> > Can you post whatever script you used to detect/reproduce the problem
> > in the first place?
>
> I've attached it, but it's pretty hackish.  Apply the patch on commit 7170268,
> make sure your --bindir is in $PATH, copy "conf-test-pg" to your home
> directory, and run "make trunc-check".  This incorporates xid-burn
> acceleration code that Jeff Janes shared with -hackers at some point.

I've tried to use it to understand the problem better, but somehow
couldn't reproduce via suggested script. I've applied it to 7170268
(tried also apply rebased to the master with the same results) with the
conf-test-pg in place, but after going through all steps there are no
errors like:

   ERROR:  could not access status of transaction 2149484247
   DETAIL:  Could not open file "pg_xact/0801": No such file or directory.

Is there anything extra one needs to do to reproduce the problem, maybe
adjust delays or something?

[1]: https://www.postgresql.org/message-id/flat/20190202083822.GC32531%40gust.leadboat.com



pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: SQL/JSON: JSON_TABLE
Next
From: Magnus Hagander
Date:
Subject: Re: global / super barriers (for checksums)