Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
Date
Msg-id CAH2-Wz=U6X_xpgREWP5Hu-m3nrDhLa7rHhmzKHoHuLSu8bJpSA@mail.gmail.com
In response to Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
List pgsql-hackers
On Wed, Jul 19, 2017 at 3:54 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
>> Index bloat is a general problem that B-Trees have in all other major
>> systems, but I think that PostgreSQL has a tendency to allow indexes
>> to become progressively more bloated over time, in a way that it often
>> can never recover from [1].
>
> Interesting assertion.

I don't pretend to understand the complicated feedback loops that may
exist for workloads that are very reliant on pruning,
kill_prior_tuples cleanup, very aggressive vacuuming, etc. I'll just
say that it seems very reasonable to suppose that they exist.

My argument for the importance of index bloat to the more general
bloat problem is simple: any bloat that cannot be cleaned up will
probably keep accumulating until it impacts performance quite
noticeably.

Index bloat may not seem as important to some as I suggest. The ideas
I'm talking about were received somewhat skeptically when I brought
them up informally at pgCon. Maybe this is a more cogent explanation:
if your indexes were somehow magically never bloated, but the heap
could become just as bloated, then it might not matter that much,
simply because the bloated heap pages wouldn't be accessed by index scans.
Heap page accesses may demonstrably be the bottleneck today, without
that magic in place, but perhaps only because indexes point to the
bloat in the first place. It could be a bit like the situation with
bloating UNDO segments in Oracle; those are not the worst thing to
have to bloat. And, the kill_prior_tuples stuff is only as strong as
the weakest link (oldest snapshot).
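
To make that weakest-link point a bit more concrete, here is a rough
psycopg2 sketch of the effect (all of the table, column, and database
names are just invented for the example): one session takes a
repeatable read snapshot and sits on it, another churns non-HOT
updates, and the primary key index grows because neither pruning nor
LP_DEAD hinting can touch versions that the old snapshot might still
need.

import psycopg2

# Rough illustration only; names are made up, adjust to taste.
churn = psycopg2.connect(dbname="test")
churn.autocommit = True
holder = psycopg2.connect(dbname="test")

cur = churn.cursor()
cur.execute("DROP TABLE IF EXISTS snap_demo")
cur.execute("CREATE TABLE snap_demo (id int PRIMARY KEY, n int NOT NULL)")
cur.execute("CREATE INDEX ON snap_demo (n)")  # updating n defeats HOT
cur.execute("INSERT INTO snap_demo SELECT g, 0 FROM generate_series(1, 10000) g")

# The weakest link: a transaction whose snapshot never advances.
holder.set_session(isolation_level="REPEATABLE READ")
hcur = holder.cursor()
hcur.execute("SELECT count(*) FROM snap_demo")  # materializes the snapshot
hcur.fetchone()

def pkey_bytes():
    cur.execute("SELECT pg_relation_size('snap_demo_pkey')")
    return cur.fetchone()[0]

before = pkey_bytes()
for _ in range(20):
    cur.execute("UPDATE snap_demo SET n = n + 1 WHERE id % 10 = 0")
    # This index scan would normally set LP_DEAD hints on entries that
    # point to dead rows, but nothing that the holder's snapshot might
    # still need can be treated as dead while it sits there.
    cur.execute("SELECT count(*) FROM snap_demo WHERE id BETWEEN 1 AND 1000")
print("snap_demo_pkey: %d -> %d bytes" % (before, pkey_bytes()))
holder.rollback()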

> Many years ago I set to implement btree page
> merging[1] from ideas in a 1996 paper[2], though that work never saw the
> light of day.  Maybe it can be valuable now.

Lots of other database systems don't implement B-Tree page merging,
because it's hard to make it work with techniques like Lehman & Yao's
B-Link trees, and average-case space utilization still ends up being
good enough without it. It may be more worthwhile for us, though.
Apparently Kevin has some ideas here.
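
For anyone who hasn't thought about it lately, here is a deliberately
tiny Python toy of what leaf-level merging buys you. It has nothing to
do with the 1996 paper's actual algorithm, or with nbtree; it just
shows the basic move: when a leaf becomes underfull, fold its entries
into a sibling and drop the parent's separator, so the page could be
recycled instead of lingering half empty forever.

# Toy sketch of leaf-level page merging.  The "tree" is just a single
# parent holding separator keys over an ordered list of leaf "pages".
MERGE_THRESHOLD = 2  # merge a leaf once it holds fewer entries than this

class Parent:
    def __init__(self, leaves):
        self.leaves = leaves
        self.separators = [leaf[0] for leaf in leaves[1:]]

    def delete(self, key):
        # Find the leaf that should hold the key.
        i = 0
        while i < len(self.separators) and key >= self.separators[i]:
            i += 1
        leaf = self.leaves[i]
        leaf.remove(key)
        # Without merging, an underfull (even empty) leaf just stays put.
        # With merging, fold it into its left sibling and drop the
        # separator.  (A real implementation would check that the sibling
        # has room, lock things in the right order, and so on.)
        if len(leaf) < MERGE_THRESHOLD and i > 0:
            self.leaves[i - 1].extend(leaf)
            del self.leaves[i]
            del self.separators[i - 1]

tree = Parent([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
for key in (5, 6, 7):
    tree.delete(key)
print(tree.leaves)      # [[1, 2, 3, 4, 8], [9, 10, 11, 12]]
print(tree.separators)  # [9]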

If I'm right about this index bloat issue, then I'd sooner tackle the
problem by preventing bloat in unique indexes in the first place, by
using some additional indirection, a topic that I've said plenty about
recently. I think that you can sometimes, though fairly rarely, see
cases that get *really* out of hand: things like in-database queues,
with unusually aggressive update patterns that generate a great many
duplicates in indexes [1]. Our handling of duplicates in unique
indexes [2] is surely a problem there.
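
Here is roughly what that looks like, as a psycopg2 + pageinspect
sketch (the names are all made up, and it's not taken from the
article, just the same general shape of workload): a handful of "job"
rows get locked and unlocked over and over, and because the updates
aren't HOT, every round adds more physical entries to the primary key
index for keys that never changed.

import psycopg2
from collections import Counter

conn = psycopg2.connect(dbname="test")
conn.autocommit = True
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS pageinspect")
cur.execute("DROP TABLE IF EXISTS jobs")
cur.execute("CREATE TABLE jobs (id int PRIMARY KEY,"
            " attempts int NOT NULL DEFAULT 0, locked_at timestamptz)")
# Indexing locked_at means updates to it can't be HOT, so every update
# also inserts a new jobs_pkey entry for an id that did not change.
cur.execute("CREATE INDEX ON jobs (locked_at)")
cur.execute("INSERT INTO jobs (id) SELECT g FROM generate_series(1, 100) g")

# Queue-style churn: the same few rows get locked and unlocked repeatedly.
for _ in range(50):
    cur.execute("UPDATE jobs SET locked_at = now(), attempts = attempts + 1"
                " WHERE id <= 10")
    cur.execute("UPDATE jobs SET locked_at = NULL WHERE id <= 10")

# Count physical pkey entries per key image on the first leaf page: many
# entries for the same handful of ids, despite only 100 live rows.
cur.execute("SELECT data FROM bt_page_items('jobs_pkey', 1)")
print(Counter(row[0] for row in cur.fetchall()).most_common(5))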

> Another thing to consider is indirect indexes, wherein you get less heap
> bloat because more updates can become HOT.

I think that the stuff I'm talking about (having indirection for the
primary key, and making sure unique indexes disallow even physical
duplicates) actually enables indirect indexes. Remember how I pointed
out issues with unique indexes and VACUUM when you presented on it at
the pgCon unconference? I think that those problems may be solvable by
centralizing everything: making duplicates within leaf pages
physically impossible for unique indexes (not just logically
impossible). Having one chokepoint for bloat cleanup across every
index is the entire point of your indirect index design, but that only
works if there is no ambiguity about what every logical pointer (in
every secondary index) points to.
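
To be clear about the shape of the thing I have in mind (this is only
a toy model in Python, not a design): the unique index keeps exactly
one physical slot per key, pointing at the row's current version,
while secondary indexes store primary key values rather than heap
pointers, so all of the version churn lands on that one slot.

class Table:
    """Toy model only: one physical slot per key in the unique index,
    plus an indirect secondary index storing primary keys, not TIDs."""

    def __init__(self):
        self.heap = {}        # fake TID -> row version
        self.next_tid = 0
        self.pk_index = {}    # pk -> single TID; physical duplicates impossible
        self.by_email = {}    # indirect secondary index: email -> set of pks

    def insert(self, pk, email):
        assert pk not in self.pk_index
        tid, self.next_tid = self.next_tid, self.next_tid + 1
        self.heap[tid] = {"pk": pk, "email": email}
        self.pk_index[pk] = tid
        self.by_email.setdefault(email, set()).add(pk)

    def update_email(self, pk, email):
        # A new version gets a new fake TID, but the unique index entry
        # is swapped in place, never duplicated; only this one slot has
        # to be maintained as versions come and go (the chokepoint).
        old_tid = self.pk_index[pk]
        old_email = self.heap[old_tid]["email"]
        tid, self.next_tid = self.next_tid, self.next_tid + 1
        self.heap[tid] = {"pk": pk, "email": email}
        self.pk_index[pk] = tid
        self.by_email.setdefault(email, set()).add(pk)
        self.by_email[old_email].discard(pk)  # could also be deferred cleanup
        del self.heap[old_tid]                # stand-in for pruning

    def lookup_by_email(self, email):
        # Secondary lookups always go back through the unique index, so
        # there is no ambiguity about which physical row a pk refers to.
        return [self.heap[self.pk_index[pk]] for pk in self.by_email.get(email, ())]

t = Table()
t.insert(1, "a@example.com")
t.update_email(1, "b@example.com")
print(t.lookup_by_email("b@example.com"))  # [{'pk': 1, 'email': 'b@example.com'}]
print(len(t.pk_index))                     # 1: still exactly one entry per key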

> If we also allow heap to be pruned from line pointers by walking indexes
> to remove specific pointers, instead of requiring indexes to be scanned
> whole for the removal, as proposed by Andres, we could get further
> along.

Yeah. I talked to Robert about doing that just today. That seems very
worthwhile, no matter how it ends up being used (whether it's by
VACUUM, something closer to a synchronous deletion, or whatever).
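
For anyone following along, the shape of that is something like the
sketch below (just my own toy restatement in Python, not what Andres
actually proposed): when pruning leaves behind a dead line pointer, we
already know the key, so we can search straight to the matching index
entry and remove it, rather than scanning every index in full to find
TIDs to kill.

import bisect

class ToyIndex:
    def __init__(self):
        self.entries = []                 # sorted list of (key, heap_tid)

    def insert(self, key, tid):
        bisect.insort(self.entries, (key, tid))

    def delete_whole_scan(self, dead_tids):
        # What a bulk VACUUM pass does: read every entry, keep survivors.
        self.entries = [e for e in self.entries if e[1] not in dead_tids]

    def delete_retail(self, key, tid):
        # What retail deletion does: go straight to the key, remove one entry.
        i = bisect.bisect_left(self.entries, (key, tid))
        if i < len(self.entries) and self.entries[i] == (key, tid):
            del self.entries[i]

idx = ToyIndex()
for k in range(10):
    idx.insert(k, ("block", k))
idx.delete_retail(3, ("block", 3))        # touches O(log n) entries
idx.delete_whole_scan({("block", 7)})     # touches every entry
print(idx.entries)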

[1] https://brandur.org/postgres-queues
[2] https://wiki.postgresql.org/wiki/Key_normalization#Avoiding_unnecessary_unique_index_enforcement
-- 
Peter Geoghegan


