Re: Should I implement DROP INDEX CONCURRENTLY? - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: Should I implement DROP INDEX CONCURRENTLY?
Date
Msg-id CA994BD8-9A07-40EB-A3CA-7C4A0A0B0E67@nasby.net
In response to Re: Should I implement DROP INDEX CONCURRENTLY?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Should I implement DROP INDEX CONCURRENTLY?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Jan 3, 2012, at 5:28 PM, Tom Lane wrote:
> Jim Nasby <jim@nasby.net> writes:
>> On Jan 3, 2012, at 12:11 PM, Simon Riggs wrote:
>>> This could well be related to the fact that DropRelFileNodeBuffers()
>>> does a scan of shared_buffers, which is an O(N) approach no matter the
>>> size of the index.
>
>> Couldn't we just leave the buffers alone? Once an index is dropped and that's pushed out through the catalog then
>> nothing should be trying to access them and they'll eventually just get aged out.
>
> No, we can't, because if they're still dirty then the bgwriter would
> first try to write them to the no-longer-existing storage file.  It's
> important that we kill the buffers immediately during relation drop.
>
> I'm still thinking that it might be sufficient to mark the buffers
> invalid and let the clock sweep find them, thereby eliminating the need
> for a freelist.  Simon is after a different solution involving getting
> rid of the clock sweep, but he has failed to explain how that's not
> going to end up being the same type of contention-prone coding that we
> got rid of by adopting the clock sweep, some years ago.  Yeah, the sweep
> takes a lot of spinlocks, but that only matters if there is contention
> for them, and the sweep approach avoids the need for a centralized data
> structure.

Yeah, but the problem we run into is that with every backend trying to run the clock on its own we end up with high
contention again... it's just in a different place than when we had a true LRU. The clock sweep might be cheaper than
the linked list was, but it's still awfully expensive. I believe our best bet is to have a free list that is actually
useful in normal operations, and then optimize the cost of pulling buffers out of that list as much as possible (and let
the bgwriter deal with keeping enough pages in that list to satisfy demand).

Heh, it occurs to me that the SQL analogy for how things work right now is that backends currently have to run a
SeqScan (or 5) to find a free page... what we need to do is CREATE INDEX free ON buffers(buffer_id) WHERE count = 0;.

> (BTW, do we have a separate clock sweep hand for each backend?  If not,
> there might be some low hanging fruit there.)

No... having multiple clock hands is an interesting idea, but I'm worried that it could potentially get us into trouble
if scores of backends were suddenly decrementing usage counts all over the place. For example, what if 5 backends all
had their hands in basically the same place, all pointing at a very heavily used buffer? All 5 backends go for free
space, they each grab the spinlock on that buffer in succession, and suddenly this highly used buffer that started with a
count of 5 has now been freed. We could potentially use more than one hand, but I think the relation between the number
of hands and the maximum usage count has to be tightly controlled.
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



