Re: Incorrect result of bitmap heap scan. - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Incorrect result of bitmap heap scan. |
Date | |
Msg-id | CAH2-WznGntS+Q_5Eye4bupdsqO0SqU5as=1-Ku9x+vKPg9pvJw@mail.gmail.com Whole thread Raw |
In response to | Re: Incorrect result of bitmap heap scan. (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
On Mon, Dec 2, 2024 at 11:32 AM Andres Freund <andres@anarazel.de> wrote: > On 2024-12-02 16:08:02 +0100, Matthias van de Meent wrote: > > Concurrency timeline: > > > > Session 1. amgetbitmap() gets snapshot of index contents, containing > > references to dead tuples on heap P1. > > IIUC, an unstanted important aspect here is that these dead tuples are *not* > visible to S1. Otherwise the VACUUM in the next step could not remove the dead > tuples. I would state the same thing slightly differently, FWIW: I would say that the assumption that has been violated is that a TID is a stable identifier for a given index tuple/logical row/row version (stable for the duration of the scan). This emphasis/definition seems slightly more useful, because it makes it clear why this is hard to hit: you have to be fairly unlucky for a dead-to-everyone TID to be returned by your scan (no LP_DEAD bit can be set) and set in the scan's bitmap, only to later be concurrently set LP_UNUSED in the heap -- all without VACUUM randomly being prevented from setting the same page all-visible due to some unrelated not-all-visible heap item making it unsafe. > Ugh :/ > > Pretty depressing that we've only found this now, ~6 years after it's been > released. We're lacking some tooling to find this stuff in a more automated > fashion. FWIW I have suspicions about similar problems with index-only scans that run in hot standby, and about all GiST index-only scans: https://postgr.es/m/CAH2-Wz=PqOziyRSrnN5jAtfXWXY7-BJcHz9S355LH8Dt=5qxWQ@mail.gmail.com In general there seems to be a lack of rigor in this area. I'm hoping that Tomas Vondra's work can tighten things up here in passing. > It seems pretty much impossible to fix that with the current interaction > between nodeBitmap* and indexam. I wonder if we could, at least in some cases, > and likely with some heuristics / limits, address this by performing some > visibility checks during the bitmap build. I'm bringing it up here because I > suspect that some of the necessary changes would overlap with what's needed to > repair bitmap index-only scans. This seems like it plays into some of the stuff I've discussed with Tomas, about caching visibility information in local state, as a means to avoiding holding onto pins in the index AM. For things like mark/restore support. > > This is quite non-trivial, however, as we don't have much in place regarding > > per-relation shared structures which we could put such a value into, nor a > > good place to signal changes of the value to bitmap heap-scanning backends. > > It doesn't have to be driven of table state, it could be state in > indexes. Most (all?) already have a metapage. FWIW GiST doesn't have a metapage (a historical oversight). > Unfortunately I don't see a better path either. > > I think it'd be good if we added a test that shows the failure mechanism so > that we don't re-introduce this in the future. Evidently this failure isn't > immediately obvious... Maybe a good use for injection points? -- Peter Geoghegan
pgsql-hackers by date: