Re: Deleting older versions in unique indexes to avoid page splits - Mailing list pgsql-hackers
From: Peter Geoghegan
Subject: Re: Deleting older versions in unique indexes to avoid page splits
Msg-id: CAH2-Wzm46rS1JDmRO0fbnQDpYyD2PrX2cYsa8-5=nhp7nT_w9A@mail.gmail.com
In response to: Re: Deleting older versions in unique indexes to avoid page splits (Victor Yegorov <vyegorov@gmail.com>)
Responses:
- Re: Deleting older versions in unique indexes to avoid page splits (Peter Geoghegan <pg@bowt.ie>)
- Re: Deleting older versions in unique indexes to avoid page splits (Victor Yegorov <vyegorov@gmail.com>)
List: pgsql-hackers
On Wed, Nov 11, 2020 at 12:58 PM Victor Yegorov <vyegorov@gmail.com> wrote:
> On the other hand, there's quite a big drop on the UPDATEs throughput. For sure, undersized shared_buffers
> contribute to this drop. Still, my experience tells me that under conditions at hand (disabled HOT due to index
> over update time column) tables will tend to accumulate bloat and produce unnecessary IO also from WAL.

I think that the big SELECT statement with an "ORDER BY mtime ... " was a good way of demonstrating the advantages of the patch.

Attached is v8, which has the enhancements for low cardinality data that I mentioned earlier today. It also simplifies the logic for dealing with posting lists that we need to delete some TIDs from. These posting list simplifications also make the code a bit more efficient, which might be noticeable during benchmarking. Perhaps your "we have 5,2% slowdown in UPDATE speed" issue will be at least somewhat fixed by the enhancements in v8?

Another consideration when testing the patch is the behavioral difference between cases where system throughput is as high as possible, and similar cases where we have a limit in place (i.e. pgbench --rate=? was used). These two cases are quite different with the patch, because the patch can no longer be lazy without a limit -- we tend to see noticeably more CPU cycles spent doing bottom-up deletion, though everything else about the profile looks similar (I generally use "perf top" to keep an eye on these things). It's possible to sometimes see increases in latency (regressions) when running without a limit, at least in the short term. These increases can go away when a rate limit is imposed that is perhaps as high as 50% of the max TPS. In general, I think that it makes sense to focus on latency when we have some kind of limit in place. A non-rate-limited case is less realistic.

> Perhaps I need to conduct a longer test session, say 8+ hours, to make obstacles appear more like in real life.
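For anyone wanting to reproduce the rate-limited vs. unlimited comparison, a sketch of the pgbench invocations follows. The client counts, duration, and rate are placeholders to be tuned per machine; only --rate and the scale factor are taken from the discussion here:

```shell
# Initialize at a large scale factor (scale factors over 2000 are
# suggested elsewhere in this message for longer-running tests):
pgbench -i -s 2000 bench

# Unlimited run: without a rate limit the patch cannot be lazy, so
# expect more CPU time in bottom-up deletion (check with "perf top"):
pgbench -c 16 -j 8 -M prepared -T 3600 -P 10 bench

# Rate-limited run at roughly 50% of the max TPS observed above
# (substitute your own number for --rate); latency from this run is
# the more realistic figure to compare:
pgbench -c 16 -j 8 -M prepared -T 3600 -P 10 --rate=5000 bench
```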
That would be ideal. It is inconvenient to run longer benchmarks, but it's an important part of performance validation.

BTW, another related factor is that the patch takes longer to "warm up". I notice this "warm-up" effect on the second or subsequent runs, where we have lots of garbage in indexes even with the patch, and even in the first 5 seconds of execution. The extra I/Os for heap page accesses end up being buffer misses instead of buffer hits, until the cache warms. This is not really a problem with fewer, longer runs, because there is relatively little "special warm-up time". (We rarely experience heap page misses during ordinary execution because the heapam.c code is smart about locality of access.)

I noticed that the pgbench_accounts_pkey index doesn't grow at all on the master branch in 20201111-results-master.txt. But it's always just a matter of time until that happens without the patch. The PK/unique index takes longer to bloat because it alone benefits from LP_DEAD setting, especially within _bt_check_unique(). But this advantage will naturally erode over time. It'll probably take a couple of hours or more with larger scale factors -- I'm thinking of pgbench scale factors over 2000.

When the LP_DEAD bit setting isn't very effective (say it's 50% effective), it's only a matter of time until every original page splits. But that's also true when LP_DEAD setting is 99% effective. While it is much less likely that any individual page will split when LP_DEAD bits are almost always set, the fundamental problem remains, even with 99% effectiveness. That problem is: each page only has to be unlucky once. On a long enough timeline, the difference between 50% effective and 99% effective may be very small. And "long enough timeline" may not actually be very long, at least to a human. Of course, the patch still benefits from LP_DEAD bits getting set by queries -- no change there.
It just doesn't *rely* on LP_DEAD bits keeping up with transactions that create bloat on every leaf page.

Maybe the patch will behave exactly the same way as the master branch -- it's workload dependent. Actually, it behaves in exactly the same way for about the first 5 - 15 minutes following pgbench initialization. This is roughly how long it takes before the master branch has even one page split. You could say that the patch "makes the first 5 minutes last forever". (Not sure if any of this is obvious to you by now, just saying.)

Thanks!

--
Peter Geoghegan