clog double-dip in heap_hot_search_buffer - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | clog double-dip in heap_hot_search_buffer |
Date | |
Msg-id | CA+TgmobwhcHrYmiH3=8uvztZYK=TCRytiYO7rX+hYdtU4dz8vg@mail.gmail.com |
Responses | Re: clog double-dip in heap_hot_search_buffer |
List | pgsql-hackers |
heap_hot_search_buffer() does this:

    valid = HeapTupleSatisfiesVisibility(heapTuple, snapshot, buffer);

If it turns out that the tuple isn't valid (i.e. visible to our scan) and we haven't yet found any live tuples in the current HOT chain, then we check whether it's visible to anyone at all:

    if (all_dead && *all_dead &&
        HeapTupleSatisfiesVacuum(heapTuple->t_data, RecentGlobalXmin,
                                 buffer) != HEAPTUPLE_DEAD)
        *all_dead = false;

This is obviously an important optimization for accelerating index cleanup, but it has an unfortunate side-effect: it considerably increases the frequency of CLOG access. Normally, HeapTupleSatisfiesVisibility() will set hint bits on the tuple, but sometimes it can't, either because the inserter has not yet committed or the inserter's commit record hasn't been flushed or the deleter hasn't committed or the deleter's commit record hasn't been flushed. When that happens, HeapTupleSatisfiesVacuum() gets called a moment later and repeats the same CLOG lookups. It is of course possible for a state change to happen in the interim, but that's not really a reason to repeat the lookups; asking the same question twice in a row just in case you should happen to get an answer you like better the second time is not generally a good practice, even if it occasionally works.

The attached patch adds a new function HeapTupleIsSurelyDead(), a cut-down version of HeapTupleSatisfiesVacuum(). It assumes that, first, we only care about distinguishing between dead and anything else, and, second, that any transaction for which hint bits aren't yet set is still running. This allows it to be a whole lot simpler than HeapTupleSatisfiesVacuum() and to get away without doing any CLOG access. It also changes heap_hot_search_buffer() to use this new function in lieu of HeapTupleSatisfiesVacuum().

I found this problem by using 'perf record -e cs -g' and 'perf report -g' to find out where context switches were happening. It turns out that this is a very significant contributor to CLOG-related context switches. Retesting with those same tools shows that the patch does in fact make those context switches go away. On a long pgbench test, the effects of WALInsertLock contention, ProcArrayLock contention, checkpoint-related latency, etc. will probably swamp the effect of the patch. On a short test, however, the effects are visible; and in general anything that optimizes away access to heavily contended shared memory data structures is probably a good thing.

Permanent tables, scale factor 100, 30-second tests:

master:
tps = 22175.025992 (including connections establishing)
tps = 22072.166338 (including connections establishing)
tps = 22653.876341 (including connections establishing)

with patch:
tps = 26586.623556 (including connections establishing)
tps = 25564.098898 (including connections establishing)
tps = 25756.036647 (including connections establishing)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
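
For readers without the attachment at hand, the hint-bit-only check described in the message can be illustrated with a small standalone model. This is only a sketch of the idea, not the attached patch: the ToyTuple struct, the TOY_* flag values, and the toy_* function names are all hypothetical and do not correspond to PostgreSQL's real tuple header layout or bit assignments. The one thing it takes from the message is the pair of assumptions: we only distinguish "dead" from "anything else", and a transaction with no hint bit set is treated as still running, so no CLOG lookup is ever needed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hint-bit flags for this toy model (not PostgreSQL's values). */
#define TOY_XMIN_COMMITTED 0x01u  /* inserter known committed */
#define TOY_XMIN_INVALID   0x02u  /* inserter known aborted */
#define TOY_XMAX_COMMITTED 0x04u  /* deleter known committed */
#define TOY_XMAX_INVALID   0x08u  /* no deleter, or deleter known aborted */
#define TOY_XMAX_LOCK_ONLY 0x10u  /* xmax is only a row lock, not a delete */

typedef uint32_t ToyXid;

/* Hypothetical, stripped-down stand-in for a heap tuple header. */
typedef struct ToyTuple {
    uint16_t infomask;  /* hint bits */
    ToyXid   xmax;      /* deleting transaction id, if any */
} ToyTuple;

/* Circular XID comparison: true if a is logically older than b. */
bool toy_xid_precedes(ToyXid a, ToyXid b)
{
    return (int32_t)(a - b) < 0;
}

/*
 * Returns true only when the hint bits alone prove the tuple is dead to
 * every transaction. Anything uncertain is reported as "not surely dead".
 */
bool toy_tuple_is_surely_dead(const ToyTuple *tup, ToyXid oldest_xmin)
{
    /* Inserter not marked committed: dead only if it is marked aborted. */
    if (!(tup->infomask & TOY_XMIN_COMMITTED))
        return (tup->infomask & TOY_XMIN_INVALID) != 0;

    /* Inserter committed, and nothing actually deleted the tuple. */
    if (tup->infomask & (TOY_XMAX_INVALID | TOY_XMAX_LOCK_ONLY))
        return false;

    /* Deleter has no commit hint: assume it is still running. */
    if (!(tup->infomask & TOY_XMAX_COMMITTED))
        return false;

    /* Deleter committed: dead once it is older than every snapshot. */
    return toy_xid_precedes(tup->xmax, oldest_xmin);
}
```

In this model a missing hint bit can only push the answer toward "not surely dead", so a caller that clears *all_dead on a false result would, at worst, defer index cleanup rather than do it wrongly. That appears to be why dropping the second round of CLOG lookups is safe here.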
Attachment