Poorly thought out code in vacuum - Mailing list pgsql-hackers

From Tom Lane
Subject Poorly thought out code in vacuum
Date
Msg-id 25422.1325810273@sss.pgh.pa.us
Whole thread Raw
Responses Re: Poorly thought out code in vacuum  (Simon Riggs <simon@2ndquadrant.com>)
Re: Poorly thought out code in vacuum  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Lately I have noticed buildfarm member jaguar occasionally failing
regression tests with
ERROR:  invalid memory alloc request size 1073741824
during a VACUUM, as for example at
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguar&dt=2012-01-04%2023%3A05%3A02

Naturally I supposed it to be a consequence of the CLOBBER_CACHE_ALWAYS
option, ie, somebody accessing an already-freed cache entry.  I managed
to duplicate and debug it here after an afternoon of trying, and it's
not actually related to that at all, except perhaps to the extent that
CLOBBER_CACHE_ALWAYS slows down some things enormously more than others.
The failure comes when a ResourceOwner tries to enlarge its
pinned-buffers array to more than 1GB, and that's not a typo: there are
umpteen entries for the same buffer, whose PrivateRefCount is 134217727
and trying to go higher.  A stack trace soon revealed the culprit, which
is this change in lazy_vacuum_heap():

@@ -932,7 +965,8 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)        tblk =
ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);       buf = ReadBufferExtended(onerel, MAIN_FORKNUM,
tblk,RBM_NORMAL,                                 vac_strategy);
 
-        LockBufferForCleanup(buf);
+        if (!ConditionalLockBufferForCleanup(buf))
+            continue;        tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats);        /* Now that
we'vecompacted the page, record its available space */
 

introduced in
http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=bbb6e559c4ea0fb4c346beda76736451dc24eb4e

The "continue" just results in another iteration of the containing tight
loop, and of course that immediately re-pins the same buffer without
having un-pinned it.  So if someone else sits on their pin of the buffer
for long enough, kaboom.

We could fix the direct symptom by inserting UnlockReleaseBuffer()
in front of the continue, but AFAICS that just makes this into a
busy-waiting equivalent of waiting unconditionally, so I don't really
see why we should not revert the above fragment altogether.  However,
I suppose Robert had something more intelligent in mind than a tight
loop when the buffer can't be exclusively locked, so maybe there is
some other change that should be made here instead.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Josh Berkus
Date:
Subject: Re: random_page_cost vs seq_page_cost
Next
From: Stephen Frost
Date:
Subject: Re: 16-bit page checksums for 9.2