Re: Vacuum: allow usage of more than 1GB of work mem - Mailing list pgsql-hackers

From: Tom Lane
Subject: Re: Vacuum: allow usage of more than 1GB of work mem
Date:
Msg-id: 7157.1473184844@sss.pgh.pa.us
In response to: Re: Vacuum: allow usage of more than 1GB of work mem (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Vacuum: allow usage of more than 1GB of work mem (Simon Riggs <simon@2ndquadrant.com>), Re: Vacuum: allow usage of more than 1GB of work mem (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> Yeah, but I've seen actual breakage from exactly this issue on
> customer systems even with the 1GB limit, and when we start allowing
> 100GB it's going to get a whole lot worse.

While it's not necessarily a bad idea to consider these things,
I think people are greatly overestimating the consequences of the
patch-as-proposed.  AFAICS, it does *not* let you tell VACUUM to
eat 100GB of workspace.  Note the line right in front of the one
being changed:
        maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);
        maxtuples = Min(maxtuples, INT_MAX);
-        maxtuples = Min(maxtuples, MaxAllocSize / sizeof(ItemPointerData));
+        maxtuples = Min(maxtuples, MaxAllocHugeSize / sizeof(ItemPointerData));

Regardless of what vac_work_mem is, we aren't gonna let you have more
than INT_MAX ItemPointers, hence 12GB at the most.  So the worst-case
increase from the patch as given is 12X.  Maybe that's enough to cause
bad consequences on some systems, but it's not the sort of disaster
Robert posits above.
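
To spell out the arithmetic, here's a standalone check (assuming the
usual 6-byte ItemPointerData; none of this is from the patch itself):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        /* sizeof(ItemPointerData) is 6 bytes: a 4-byte block number
         * plus a 2-byte offset number (storage/itemptr.h). */
        const double itemptr_size = 6.0;

        /* The Min(maxtuples, INT_MAX) line quoted above caps the
         * dead-TID array at INT_MAX entries, whatever vac_work_mem is. */
        double cap_gb = INT_MAX * itemptr_size / (1024.0 * 1024.0 * 1024.0);

        printf("hard cap: %.1f GB\n", cap_gb);  /* prints 12.0 */
        return 0;
    }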

It's also worth re-reading the lines just after this, which constrain
the allocation a whole lot more for small tables.  Robert comments:

> ...  But VACUUM will very happily allocate
> vastly more memory than the number of dead tuples.  It is thankfully
> smart enough not to allocate more storage than the number of line
> pointers that could theoretically exist in a relation of the given
> size, but that only helps for very small relations.  In a large
> relation that divergence between the amount of storage space that
> could theoretically be needed and the amount that is actually needed
> is likely to be extremely high.  1 TB relation = 2^27 blocks, each of
> which can contain MaxHeapTuplesPerPage dead line pointers.  On my
> system, MaxHeapTuplesPerPage is 291, so that's 291 * 2^27 possible
> dead line pointers, which at 6 bytes each is 291 * 6 * 2^27 = ~218GB,
> but the expected number of dead line pointers is much less than that.
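
The same sort of back-of-the-envelope check reproduces that ~218GB
figure (again just illustrative C, using the constants Robert quotes):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t blocks = UINT64_C(1) << 27; /* 1 TB of 8kB pages */
        uint64_t tuples_per_page = 291;      /* MaxHeapTuplesPerPage on his build */
        uint64_t itemptr_size = 6;           /* sizeof(ItemPointerData) */

        double gb = (double) (blocks * tuples_per_page * itemptr_size) /
                    (1024.0 * 1024.0 * 1024.0);
        printf("theoretical worst case: %.0f GB\n", gb);  /* ~218 */
        return 0;
    }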

If we think the expected number of dead pointers is so much less than
that, why don't we just decrease LAZY_ALLOC_TUPLES, and take a hit in
extra index vacuum cycles when we're wrong?
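
For reference, the clamp where LAZY_ALLOC_TUPLES enters looks roughly
like this (paraphrased from lazy_space_alloc in vacuumlazy.c; the exact
coding varies by version):

    /* LAZY_ALLOC_TUPLES is currently just MaxHeapTuplesPerPage, so we
     * budget a full page's worth of dead TIDs for every heap page.
     * Lowering it shrinks the up-front allocation, at the cost of more
     * index vacuum passes whenever the guess turns out too low. */
    if ((BlockNumber) (maxtuples / LAZY_ALLOC_TUPLES) > relblocks)
        maxtuples = relblocks * LAZY_ALLOC_TUPLES;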

(Actually, what I'd be inclined to do is let it have MaxHeapTuplesPerPage
slots per page up till a few meg, and then start tailing off the
space-per-page, figuring that the law of large numbers will probably kick
in.)
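
A minimal sketch of that tapering idea (the 4MB knee and the sqrt
falloff are arbitrary choices for illustration, not a worked-out
proposal):

    #include <math.h>
    #include <stdint.h>
    #include <stdio.h>

    #define TUPLES_PER_PAGE  291                 /* MaxHeapTuplesPerPage, per above */
    #define ITEMPTR_SIZE     6                   /* sizeof(ItemPointerData) */
    #define FULL_ALLOC_BYTES (4 * 1024 * 1024)   /* "a few meg": assume 4MB */

    static uint64_t
    guess_dead_tuple_slots(uint64_t relblocks)
    {
        /* Blocks we can cover at the full per-page rate before the
         * estimate exceeds the FULL_ALLOC_BYTES budget. */
        uint64_t full_blocks = FULL_ALLOC_BYTES / (TUPLES_PER_PAGE * ITEMPTR_SIZE);

        if (relblocks <= full_blocks)
            return relblocks * TUPLES_PER_PAGE;

        /* Past the knee, grow sublinearly: by the law of large numbers,
         * a huge table is very unlikely to be wall-to-wall dead tuples. */
        return full_blocks * TUPLES_PER_PAGE +
            (uint64_t) (sqrt((double) (relblocks - full_blocks)) * TUPLES_PER_PAGE);
    }

    int main(void)
    {
        /* 1GB table (2^17 blocks) vs. 1TB table (2^27 blocks) */
        printf("%llu slots vs. %llu slots\n",
               (unsigned long long) guess_dead_tuple_slots(UINT64_C(1) << 17),
               (unsigned long long) guess_dead_tuple_slots(UINT64_C(1) << 27));
        return 0;
    }
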
        regards, tom lane


