On Sun, Jun 27, 2021 at 11:08 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > That said, the relevant table is the active "alarms" table, and it would've
> > gotten plenty of DML with no issue for months running v13.
>
> It might not have been visibly broken without assertions enabled,
> though. I sprinkled nbtdedup.c with these _bt_posting_valid()
> assertions just because it was easy. The assertions were bound to
> catch some problem sooner or later, and had acceptable overhead.

Obviously nothing stops you from running amcheck on the original
database that you're running in production. You won't need an
assertion-enabled build to catch the same problem that way. This seems like
the best way to isolate the problem. I strongly suspect that it's the
LVM issue for my own reasons: nothing changed during the Postgres 14
cycle that seems truly related.
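
A rough sketch of what such an amcheck pass could look like, written as a
small Python helper that only builds the SQL (feed the output to psql).
bt_index_check() and its heapallindexed argument are part of contrib/amcheck;
the helper name amcheck_sql is made up for this example, and "alarms" is
simply the table mentioned earlier in the thread. Nothing here has been run
against the actual schema.

```python
# Sketch only: builds SQL for an amcheck pass over one table's B-Tree indexes.
# amcheck_sql() is a hypothetical helper; "alarms" comes from this thread.

def amcheck_sql(table: str = "alarms", heapallindexed: bool = True) -> str:
    """Return SQL that runs bt_index_check() on each valid B-Tree index of `table`."""
    # Quote the name as a SQL literal for the regclass cast
    # (embedded single quotes are doubled).
    literal = "'" + table.replace("'", "''") + "'"
    flag = "true" if heapallindexed else "false"
    return (
        "CREATE EXTENSION IF NOT EXISTS amcheck;\n"
        "SELECT c.relname,\n"
        f"       bt_index_check(index => c.oid, heapallindexed => {flag})\n"
        "FROM pg_index i\n"
        "JOIN pg_class c ON c.oid = i.indexrelid\n"
        "JOIN pg_am am ON am.oid = c.relam\n"
        f"WHERE i.indrelid = {literal}::regclass\n"
        "  AND am.amname = 'btree'\n"
        "  AND i.indisvalid;\n"
    )


if __name__ == "__main__":
    # Example: python amcheck_sql.py | psql -d yourdb
    print(amcheck_sql())
```

bt_index_check() only needs an AccessShareLock on the index and its table, so
it should be reasonable to run on a live system; bt_index_parent_check() is
more thorough but takes a stronger lock.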

The index deletion stuff (commit d168b666823) might seem like an
obvious possible culprit, but I consider it unlikely. I added many
defensive assertions to that code too: _bt_bottomupdel_pass() also
uses exactly the same kind of _bt_posting_valid() assertions directly.
Plus _bt_delitems_delete_check() is highly defensive with assertions
when it processes a posting list tuple. If there was a problem with
any of that code it seems very likely that those assertions would have
failed first.
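
For anyone unfamiliar with those assertions, the invariant they guard is
roughly this: a posting list tuple holds at least two heap TIDs, each of them
valid, in strictly ascending order. Below is a toy Python model of that
invariant. It is an illustration only, not the C code in nbtdedup.c; the
function name posting_list_valid and the (block, offset) tuple representation
of a TID are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Toy stand-in for a heap TID: (block number, offset number).
TID = Tuple[int, int]


def posting_list_valid(tids: List[TID]) -> bool:
    """Return True if `tids` looks like a well-formed posting list."""
    # A posting list only makes sense with at least two TIDs.
    if len(tids) < 2:
        return False

    last: Optional[TID] = None
    for tid in tids:
        block, offset = tid
        # Offset numbers start at 1; 0 is treated as an invalid TID here.
        if block < 0 or offset < 1:
            return False
        # TIDs must be strictly ascending: compare block first, then offset
        # (Python's tuple comparison does exactly that). Duplicates fail too.
        if last is not None and tid <= last:
            return False
        last = tid
    return True
```

A list such as [(1, 1), (1, 2), (2, 1)] passes, while an out-of-order or
duplicated TID, e.g. [(1, 2), (1, 1)], fails. That is the kind of damage
these assertions are meant to notice.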

--
Peter Geoghegan