The commit of this patch seems to have introduced a bug in which updated tuples can disappear from the index while remaining in the table.
It looks like the bug depends on going through a crash-recovery cycle, but I am not sure of that yet.
I've looked through the commit diff and don't see anything obviously wrong. I notice that index tuples are marked dead while holding only a share lock on the buffer content, and that the page is defragmented while holding only an exclusive buffer lock (as opposed to a super-exclusive buffer cleanup lock). But as far as I can tell, both of those should be safe on an index. Also, if that were the bug, it should happen without crash-recovery.
The test is pretty simple. I create a 10,000 row table with a unique-by-construction id column, a btree_gist index on that column, and a counter column. I then fire single-row updates of the counter for random ids at high concurrency (8 processes running flat out). I force the server to crash frequently with simulated torn-page writes, in which md.c writes a partial page and then PANICs. Eventually (after 1 to 3 hours) the updates start reporting that they updated 0 rows. At that point, a forced sequential scan of the table finds the row, but an index scan does not.
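To make the shape of the workload concrete, here is a rough Python sketch that only builds the SQL involved. It is not the actual harness: the table name `foo`, the index name `foo_id_gist`, the column names, and the helper functions are all made up for illustration. It also does not cover the crash injection, which needs a patched server.

```python
import random

# All names below are hypothetical; the original harness is not shown in this post.
TABLE = "foo"
INDEX = "foo_id_gist"
N_ROWS = 10_000


def setup_sql(n_rows=N_ROWS):
    """Statements that create the table, fill it, and build the GiST index."""
    return [
        "CREATE EXTENSION IF NOT EXISTS btree_gist",
        f"CREATE TABLE {TABLE} (id int, counter int)",
        # ids come from generate_series, so they are unique by construction
        # even though no unique constraint is declared.
        f"INSERT INTO {TABLE} SELECT g, 0 FROM generate_series(1, {n_rows}) AS g",
        f"CREATE INDEX {INDEX} ON {TABLE} USING gist (id)",
    ]


def update_sql(rng, n_rows=N_ROWS):
    """Pick a random id and return it with the single-row UPDATE for it."""
    row_id = rng.randint(1, n_rows)
    return row_id, f"UPDATE {TABLE} SET counter = counter + 1 WHERE id = {row_id}"


def seqscan_check_sql(row_id):
    """Statements that look the row up with index-based plans disabled."""
    return [
        "SET enable_indexscan = off",
        "SET enable_indexonlyscan = off",
        "SET enable_bitmapscan = off",
        f"SELECT count(*) FROM {TABLE} WHERE id = {row_id}",
        "RESET enable_indexscan",
        "RESET enable_indexonlyscan",
        "RESET enable_bitmapscan",
    ]


def classify(update_rowcount, seqscan_count):
    """Interpret one UPDATE result together with the forced seq-scan count."""
    if update_rowcount >= 1:
        return "ok"
    if seqscan_count >= 1:
        # The row is in the heap, but the index-driven UPDATE did not find it.
        return "missing-from-index"
    return "row-absent"


if __name__ == "__main__":
    rng = random.Random()
    for stmt in setup_sql():
        print(stmt + ";")
    row_id, stmt = update_sql(rng)
    print(stmt + ";")
    for check in seqscan_check_sql(row_id):
        print(check + ";")
```

Each of the 8 worker processes would loop on `update_sql`, and whenever an UPDATE reports 0 rows it would run the `seqscan_check_sql` statements for the same id. A result of `missing-from-index` from `classify` corresponds to the symptom described above.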
Any hints on how to proceed with debugging this? If I can't reproduce the problem in the absence of crash-recovery cycles with an overnight run, then I think my next step will be to run it over hot standby and see whether WAL replay in the absence of crashes might be broken as well.