Re: BUG #17268: Possible corruption in toast index after reindex index concurrently - Mailing list pgsql-bugs

From Peter Geoghegan
Subject Re: BUG #17268: Possible corruption in toast index after reindex index concurrently
Msg-id CAH2-WznMD0TOL5njf0udBZdDpagu+ZCLE9E-TpLXqezvPy6Z9A@mail.gmail.com
In response to Re: BUG #17268: Possible corruption in toast index after reindex index concurrently  (Maxim Boguk <maxim.boguk@gmail.com>)
List pgsql-bugs
On Thu, Nov 4, 2021 at 12:47 PM Maxim Boguk <maxim.boguk@gmail.com> wrote:
> now... and yes during the time of error page 59561917 was very close
> to the end of the table.
> There was a high chance (but not 100%) that the corresponding main
> table entry had been inserted during reindex CONCURRENTLY of the toast
> index run.

It might be useful if you located the leaf page that the missing index
tuple is supposed to be on. It's possible that there is a recognizable
pattern. If you knew the block number of the relevant leaf page in the
index already, then you could easily dump the relevant page to a small
file, and share it with us here. The usual procedure is described
here:


https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#contrib.2Fpageinspect_page_dump
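
For reference, a minimal sketch of that kind of page dump, assuming
the contrib pageinspect extension is available and that you run it as
a superuser. The block number 12345 is only a placeholder, not a value
taken from this report:

```sql
-- pageinspect ships in contrib; installing it needs suitable privileges.
CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Emit a single page of the toast index as base64 text.
-- 12345 is a placeholder block number; substitute the block of interest.
SELECT encode(
         get_raw_page('pg_toast.pg_toast_2624976286_index', 12345),
         'base64');
```

The base64 output can be saved to a file and decoded back into a
single raw page before sharing it.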

The tricky part is figuring out which block number is the one of
interest. I can't think of any easy way of doing that in a production
database. The easiest approach I can think of is to use
pg_buffercache. Restart the database (or more likely an instance of
the database that has the problem, but isn't the live production
database), so that shared buffers start out empty. Then write a query
like this:

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM pg_toast.pg_toast_2624976286
WHERE chunk_id = 4040061139;

(The BUFFERS stuff is to verify that you got buffer hits.)

Next query the blocks that you see in pg_buffercache:

SELECT relblocknumber FROM pg_buffercache WHERE relfilenode =
pg_relation_filenode('pg_toast.pg_toast_2624976286_index');

Finally, note down the block numbers that the query returns. There
will probably be 3 or 4. Just send me any that are non-0 (block 0 is
just the metapage). I only really care about the leaf page, but I can
figure that part out myself once I have the pages that you accessed.
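
If it helps, here is a rough sketch of a narrower version of the
pg_buffercache query, which skips the metapage and ignores other
databases and relation forks. The extra filters are my own additions
for illustration. The second statement assumes pageinspect is
installed, and 12345 is again just a placeholder block number:

```sql
-- Blocks of the toast index currently in shared buffers,
-- excluding block 0 (the metapage).
SELECT b.relblocknumber
FROM pg_buffercache AS b
WHERE b.relfilenode =
      pg_relation_filenode('pg_toast.pg_toast_2624976286_index')
  AND b.reldatabase =
      (SELECT oid FROM pg_database WHERE datname = current_database())
  AND b.relforknumber = 0       -- main fork only
  AND b.relblocknumber <> 0     -- skip the metapage
ORDER BY b.relblocknumber;

-- Optional: check what kind of B-Tree page a given block is.
-- In the output, type 'l' means leaf, 'i' internal, 'r' root.
SELECT blkno, type, live_items
FROM bt_page_stats('pg_toast.pg_toast_2624976286_index', 12345);
```

Running the second statement for each block from the first query
should show which one is the leaf page.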

(It's a pity there isn't a less cumbersome procedure here.)

-- 
Peter Geoghegan


