> > I don't see why it is hard. Trying to find the index entry for the old > update row seems odd, and updating an index row seems odd, but skipping > an index addition for the new row seems simple. Is the problem > concurrent activity? I assumed already had that ability to add to HOT > chains because we have to lock the update chain.
Think of an index over a 1TB table whose keys have only 2 values: true and false.
Sure, it's a horrible index. But I've seen things like that, and I've seen cases when they're useful too.
So, conceptually, for each key you have circa N/2 tids on the index. nbtree finds the leftmost valid insert point comparing keys, it doesn't care about tids, so to find the index entries that point to the page where the new tuple is, you'd have to scan the N/2 tids in the index, an extremely expensive operation.
Well, it's always going to be extremely hard to solve for all use cases. This is one such extreme case and we should just give up and do cold update.
I think we can look at the index type (unique vs non-unique) along with table statistics to find what fraction of column values are distinct and then estimate whether its worthwhile to look for duplicate (key, CTID) or just do a cold update. In addition put some cap of how hard we try once we decide to check for duplicates and give up after we cross that threshold.