The attached proof-of-concept patch greatly reduces the bloat for both the insert and the update cases. You need to turn on both features, adding the pages to the FSM and vacuuming the FSM, to get the benefit (so JJ_GIN=3). The first of those two things could probably be adopted for real, but the second probably is not acceptable.

What is the right way to do this? Could a variant of RecordFreeIndexPage bubble the free space up the map immediately, rather than waiting for a vacuum? It would only have to move up until it found a page with free space already recorded in it, which the vast majority of the time would mean looking one level up and then not having to write to it, assuming the pending-list pages remain well clustered.
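To make that concrete, here is a toy sketch, not PostgreSQL code: a single in-memory max-tree standing in for the FSM (the real map spans multiple pages and levels), with a record_free() function invented just for this illustration. It bubbles a new value up only until an ancestor already advertises at least that much free space:

#include <stdio.h>
#include <stdint.h>

#define NPAGES 8                     /* leaf slots: one per index page */

/* Heap-style array: tree[1] is the root, leaves live at NPAGES..2*NPAGES-1. */
static uint8_t fsm_tree[2 * NPAGES];

/*
 * Record free space for one page and bubble it up, but stop as soon as an
 * ancestor already advertises at least this much space: searches descending
 * from the root can already reach the page, so no further writes are needed.
 */
static void
record_free(int page, uint8_t freespace)
{
    int node = NPAGES + page;

    fsm_tree[node] = freespace;
    for (node /= 2; node >= 1; node /= 2)
    {
        if (fsm_tree[node] >= freespace)
            break;               /* already visible; nothing to write */
        fsm_tree[node] = freespace;
    }
}

int
main(void)
{
    record_free(3, 200);         /* first freed page: updates every ancestor */
    record_free(2, 150);         /* clustered neighbour: reads one level, writes nothing */
    printf("root advertises %d\n", fsm_tree[1]);
    return 0;
}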
You make a good case for action here, since insert-only tables with GIN indexes on text columns are a common use case.
Why would vacuuming the FSM be unacceptable? With a large gin_pending_list_limit it makes sense.
But with a smallish gin_pending_list_limit (like the default 4MB), this could be called a lot (multiple times a second during some spurts), and it would read the entire FSM each time.
If it is unacceptable, perhaps we can avoid calling it every time, or simply have FreeSpaceMapVacuum() terminate more quickly on some kind of 80/20 heuristic for this case.
Or maybe it could be passed a range of blocks which need vacuuming, so that it concentrates on just that range.
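As an interface sketch only (FreeSpaceMapVacuumRange and the helper below are hypothetical here, not something in the tree under discussion), the caller would hand the FSM vacuum just the block range it freed:

#include "postgres.h"
#include "storage/freespace.h"

/* Hypothetical range-limited variant of FreeSpaceMapVacuum(). */
extern void FreeSpaceMapVacuumRange(Relation rel,
                                    BlockNumber start,
                                    BlockNumber end);

/* A cleanup pass could then restrict the FSM scan to what it just freed. */
static void
vacuum_fsm_for_freed_range(Relation index,
                           BlockNumber first_freed,
                           BlockNumber last_freed)
{
    FreeSpaceMapVacuumRange(index, first_freed, last_freed + 1);
}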
But from the README file, it sounds like the free space is already supposed to bubble up. I'll have to see just what's going on there when I get a chance.
Before making changes to the FSM code to make immediate summarization possible, I decided to quantify the effect of vacuuming the entire FSM. Going up to 5 GB of index size, the time taken to vacuum the entire FSM once for every GIN_NDELETE_AT_ONCE pages deleted was undetectable.
Based on that, I made this patch, which vacuums the FSM once per completed ginInsertCleanup call; that should be far less often than once per GIN_NDELETE_AT_ONCE batch.
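A minimal sketch of that call pattern (not the actual patch; gin_recycle_pending_pages and its argument list are made up for the illustration, while RecordFreeIndexPage and FreeSpaceMapVacuum are the existing FSM entry points):

#include "postgres.h"
#include "storage/freespace.h"
#include "storage/indexfsm.h"

/*
 * Hypothetical helper: after a completed pending-list cleanup has deleted
 * its pages, record them in the FSM and vacuum the FSM once, instead of
 * once per GIN_NDELETE_AT_ONCE batch.
 */
static void
gin_recycle_pending_pages(Relation index, BlockNumber *freed, int nfreed)
{
    int         i;

    /* Leaf-level FSM entries, as the existing code already makes. */
    for (i = 0; i < nfreed; i++)
        RecordFreeIndexPage(index, freed[i]);

    /* One FSM vacuum per completed cleanup, to publish the free space. */
    FreeSpaceMapVacuum(index);
}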
I would be interested in hearing what people with very large GIN indexes think of it. It does seem like at some point the time needed must become large, but from what I can tell that point is way beyond what someone is likely to have for an index on an unpartitioned table.
I have a simple test case that inserts an array of 101 md5 digests into each row. With 10,000 of these rows inserted into an already-indexed table, I get 40 MB for the table and 80 MB for the index unpatched. With the patch, I get 7.3 MB for the index.