Re: [BUGS] Breakage with VACUUM ANALYSE + partitions - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [BUGS] Breakage with VACUUM ANALYSE + partitions
Date
Msg-id CA+TgmoaX35ui4DVN_74YyxS_gtfMXPo4pkru-AAkBoSCh9FLuA@mail.gmail.com
Whole thread Raw
In response to Re: [BUGS] Breakage with VACUUM ANALYSE + partitions  (Andres Freund <andres@anarazel.de>)
Responses Re: [BUGS] Breakage with VACUUM ANALYSE + partitions  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Mon, Apr 11, 2016 at 12:50 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-04-04 12:45:25 -0400, Robert Haas wrote:
>> Well, I agree that it's pretty strange that _mdfd_getseg() makes no
>> such check, but I still don't think I understand what's going on here.
>> Backends shouldn't be requesting nonexistent blocks from a relation -
>> higher-level safeguards, like holding AccessExclusiveLock before
>> trying to complete a DROP or TRUNCATE - are supposed to prevent that.
>
> I don't think that's really true: We write blocks back without holding
> any sort of relation level locks; and thus do _mdfd_getseg() type
> accesses as well.

You're right, but I think that's more because I didn't say it
correctly than because you haven't done something novel.  DROP and
relation truncation know about shared buffers, and they go clear
blocks that that might be affected from it as part of the truncate
operation, which means that no other backend will see them after they
are gone.  The lock makes sure that no other references can be added
while we're busy removing any that are already there.  So I think that
there is currently an invariant that any block we are attempting to
access should actually still exist.  It sounds like these references
are sticking around in backend-private memory, which means they are
neither protected by locks nor able to be cleared out on drop or
truncate.  I think that's a new thing, and a bit scary.

> And we're not really "requesting nonexistant blocks" - the segments are
> just opened to get the associated file descriptor, and they're opened
> with EXTENSION_RETURN_NULL. It turns out to be important
> performance-wise to reuse fd's when triggering kernel writeback.

The possibly-saving grace here, I suppose, is that the references
we're worried about are just being used to issue hints to the
operating system.  So I guess if we sent a hint on a wrong block or
skip sending a hint altogether because of some failure, no harm done,
as long as we don't error out.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [BUGS] Breakage with VACUUM ANALYSE + partitions
Next
From: "Karl O. Pinc"
Date:
Subject: Re: PQsendQuery+PQgetResult+PQsetSingleRowMode limitations and support