Re: Rework the way multixact truncations work - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Rework the way multixact truncations work
Date
Msg-id 20150629194840.GH20882@alap3.anarazel.de
Whole thread Raw
In response to Rework the way multixact truncations work  (Andres Freund <andres@anarazel.de>)
Responses Re: Rework the way multixact truncations work  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

I'm still working on this patch. I found a bunch of issues. Non of them
super critical and some pre-existing; but nonetheless I don't feel like
it's ready to push yet. So we'll have the first alpha without those
fixes :(

The biggest issues so far are:

* There's, both in the posted patch and as-is in all the branches, a lot
  of places that aren't actually safe against a concurrent truncation. A
  bunch of places grab oldestMultiXactId with an lwlock held, release
  it, and then make decisions based on that.

  A bunch of places (including the find_multixact_start callsites) is
  actually vulnerable to that, for a bunch of others it's less likely to
  be a problem.  All callers of GetMultiXactIdMembers() are vulnerable,
  but with the exception of pg_get_multixact_members() they'll never
  pass in a value that's older than the new oldest member value.

  That's a problem for the current branches. Afaics that can lead to a
  useless round of emergency autovacs via
  SetMultiXactIdLimit()->SetOffsetVacuumLimit().

  SetOffsetVacuumLimit() can protect easily agains that by taking the
  new MultiXactTruncationLock lock. We could do the same for
  pg_get_multixact_members() - afaics the only caller that'll look up too
  old values otherwise - but I don't think it matters, you'll get a
  slightly obscure error if you access a too old xact that's just being
  truncated away and that is that.

* There was no update of the in-memory oldest* values when replaying a
  truncation. That's problematic if a standby is promoted after
  replaying a truncation record, but before a following checkpoint
  record. This would be fixed by an emergency autovacuum, but that's
  obviously not nice. Trivial to fix.

* The in-memory oldest values were updated *after* the truncation
  happened. It's unlikely to matter in reality, but it's safer to
  update them before, so a concurrent GetMultiXactIdMembers() of stuff
  from before the truncation will get the proper error.

* PerformMembersTruncation() probably confused Alvaro because it wasn't
  actually correct - there's no need to have the segment containing old
  oldestOffset (in contrast to oldestOffsetAlive) survive. Except
  leaking a segment that's harmless, but obviously not desirable.

Additionally I'm changing some stuff, some requested by review:
* xl_multixact_truncate's members are now called
  (start|end)Trunc(Off|Memb)
* (start|end)TruncOff have the appropriate type now
* typo fixes
* comment improvements
* pgindent

New version attached.

Greetings,

Andres Freund

Attachment

pgsql-hackers by date:

Previous
From: Merlin Moncure
Date:
Subject: Re: pg_trgm version 1.2
Next
From: Andres Freund
Date:
Subject: Re: 9.5 release notes