Thread: Partial index "microvacuum"
So I've been looking at issues we used to have in production some time ago which eventually lead us to migrating away from partial indexes in some cases. In the end, I'm surprised how easy this (or at least a similar case) was to reproduce. The attached program does some UPDATEs where around every third update deletes the row from the partial index since it doesn't match indpred anymore. In that case the row is immediately UPDATEd back to match the index WHERE clause again. This roughly emulates what some of our processes do in production. Today, running the program for a few minutes (until the built-in 262144 iteration limit), I usually end up with a partial index through which producing the only row takes milliseconds on a cold cache, and over a millisecond on a hot one. Finding the row through the primary key is still fast, because the bloat there gets cleaned up. As far as I can tell, after the index has gotten into this state, there's no way to clean it up except VACUUMing the entire table or a REINDEX. Both solutions are pretty bad. My working theory was that this has to do with the fact that HeapTupleSatisfiesMVCC doesn't set the HEAP_XMAX_COMMITTED bit here, but I'm not so sure anymore. Has anyone seen something like this? If that really is what's happening here, then I can see why we wouldn't want to slow down SELECTs with expensive visibility checks. But that really leaves me wishing for something like VACUUM INDEX partial_idx. Otherwise your elephant just keeping getting slower and slower until you get called at 2 AM to play REINDEX. (I've tested this on 9.6, v11 and v13. 13 seems to be a bit better here, but not "fixed", I think.) .m
Attachment
On Wed, Sep 15, 2021 at 7:18 AM Marko Tiikkaja <marko@joh.to> wrote: > So I've been looking at issues we used to have in production some time > ago which eventually lead us to migrating away from partial indexes in > some cases. In the end, I'm surprised how easy this (or at least a > similar case) was to reproduce. > (I've tested this on 9.6, v11 and v13. 13 seems to be a bit better > here, but not "fixed", I think.) What about v14? There were significant changes to the microvacuum/index deletion stuff in that release: https://www.postgresql.org/docs/14/btree-implementation.html#BTREE-DELETION -- Peter Geoghegan
On Wed, Sep 15, 2021 at 7:25 PM Peter Geoghegan <pg@bowt.ie> wrote: > What about v14? There were significant changes to the > microvacuum/index deletion stuff in that release: > > https://www.postgresql.org/docs/14/btree-implementation.html#BTREE-DELETION Huh. Interesting. I'm sorry, I wasn't aware of this work and didn't have version 14 at hand. But it looks like both the partial index as well as the secondary index on (id::text) get cleaned up nicely there. I even tried a version where I have a snapshot open for the entire run, and the subsequents SELECTs clean the bloat up. I'll need to read up on the details a bit to understand exactly what changed, but it appears that at least this particular pattern has already been fixed. Thank you so much for your work on this! .m
On Thu, Sep 16, 2021 at 4:45 AM Marko Tiikkaja <marko@joh.to> wrote: > Huh. Interesting. I'm sorry, I wasn't aware of this work and didn't > have version 14 at hand. But it looks like both the partial index as > well as the secondary index on (id::text) get cleaned up nicely there. That's great. I understand why other hackers see partial indexes as a special case, but I don't really see them that way. The only substantive difference is the considerations for HOT safety in your scenario, versus a scenario with an equivalent non-partial index. By equivalent I mean an index that is the same in every way, but doesn't have a predicate. And with the same workload. In other words, an index that really should have been partial (because the "extra" index tuples are useless in practice), but for whatever reason wasn't defined that way. If you look at what's going on at the level of the constantly modified leaf pages in each scenario, then you'll see no differences -- none at all. The problem of VACUUM running infrequently is really no worse with the partial index. VACUUM runs infrequently relative to the small useful working set in *either* scenario. The useless extra index tuples in the non-partial-index scenario only *hide* the problem -- obviously they're not protective in any way. > I even tried a version where I have a snapshot open for the entire > run, and the subsequents SELECTs clean the bloat up. I'll need to > read up on the details a bit to understand exactly what changed, but > it appears that at least this particular pattern has already been > fixed. Bottom-up index deletion tends to help even when a snapshot holds back cleanup. For example: https://www.postgresql.org/message-id/CAGnEbogATZS1mWMVX8FzZHMXzuDEcb10AnVwwhCtXtiBpg3XLQ@mail.gmail.com It's hard to explain exactly why this happens. The short version is that there is a synergy between deduplication and bottom-up index deletion. As bottom-up index deletion starts to fail (because it fundamentally isn't possible to delete any more index tuples on the page due to the basic invariants for cleanup not allowing it), deduplication "takes over for the page". Deduplication can "absorb" extra versions from non-hot updates. A deduplication pass could easily buy us enough time for the old snapshot to naturally go away. Next time around a bottom-up index deletion pass is attempted for the same page, we'll probably find something to delete. Just accepting version-driven page splits was always a permanent solution to a temporary problem. > Thank you so much for your work on this! Thanks Marko! -- Peter Geoghegan