Re: VACUUM FULL versus system catalog cache invalidation - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: VACUUM FULL versus system catalog cache invalidation
Msg-id 4E457E37.6020706@enterprisedb.com
In response to Re: VACUUM FULL versus system catalog cache invalidation  (Robert Haas <robertmhaas@gmail.com>)
On 12.08.2011 21:49, Robert Haas wrote:
> On Fri, Aug 12, 2011 at 2:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> 2. Forget about targeting catcache invals by TID, and instead just use the
>> key hash value to determine which cache entries to drop.
>>
>> Approach #2 seems a lot less invasive and more trustworthy, but it has the
>> disadvantage that cache invals would become more likely to blow away
>> entries unnecessarily (because of chance hashvalue collisions), even
>> without any VACUUM FULL being done.  If we could make approach #1 work
>> reliably, it would result in more overhead during VACUUM FULL but less at
>> other times --- or at least we could hope so.  In an environment where
>> lots of sinval overflows and consequent resets happen, we might come out
>> behind due to doubling the number of catcache flushes forced by a reset
>> event.
>>
>> Right at the moment I'm leaning to approach #2.  I wonder if anyone
>> sees it differently, or has an idea for a third approach?
>
> I don't think it really matters whether we occasionally blow away an
> entry unnecessarily due to a hash-value collision.  IIUC, we'd only
> need to worry about hash-value collisions between rows in the same
> catalog; and the number of entries that we have cached had better be
> many orders of magnitude less than 2^32.  If the cache is large enough
> that we're having hash value collisions more than once in a great
> while, we probably should have flushed some entries out of it a whole
> lot sooner and a whole lot more aggressively, because we're likely
> eating memory like crazy.
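
Robert's claim can be sanity-checked with a standard birthday-problem estimate. This is a back-of-the-envelope sketch, not PostgreSQL code; `collision_probability` is a hypothetical helper:

```c
/* Back-of-the-envelope birthday estimate (not catcache code):
 * with n cached entries and uniformly distributed 32-bit hash
 * values, the chance that at least one pair collides is roughly
 * n*(n-1)/2 / 2^32, valid while the result is small. */
static double collision_probability(double n)
{
    return n * (n - 1.0) / 2.0 / 4294967296.0;
}
```

For n = 10,000 cached entries this comes out to about 0.012, i.e. roughly a 1-in-86 chance that some pair of entries in the same cache shares a hash value -- rare, as Robert says, but not vanishingly so.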

What would suck, though, is if you have an application that repeatedly 
creates and drops a temporary table, and the hash value for that happens 
to match some other table in the database. Catcache invalidation would
keep flushing the entry for that other table too, and you couldn't do
anything about it except rename one of the tables.
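
That scenario can be sketched in miniature. This is a toy model, not the actual catcache implementation: the struct and both functions are invented for illustration (the real catcache hashes the key columns and keeps entries in hash buckets):

```c
#include <stdint.h>

/* Toy model of hash-value-based invalidation (approach #2); all
 * names here are invented for illustration, not real catcache code. */
typedef struct
{
    const char *key;        /* full lookup key, e.g. a relation name */
    uint32_t    hashvalue;  /* 32-bit hash of the key */
    int         valid;      /* 1 while the entry is cached */
} ToyCacheEntry;

#define TOY_NENTRIES 3
static ToyCacheEntry toy_cache[TOY_NENTRIES];

/* An invalidation message carries only the hash value, so every entry
 * whose hash matches is dropped -- including an innocent bystander
 * whose key merely collides with the dropped row's key. */
static int toy_invalidate_by_hash(uint32_t hashvalue)
{
    int flushed = 0;
    for (int i = 0; i < TOY_NENTRIES; i++)
    {
        if (toy_cache[i].valid && toy_cache[i].hashvalue == hashvalue)
        {
            toy_cache[i].valid = 0;
            flushed++;
        }
    }
    return flushed;
}
```

If the temp table's key happened to hash to the same value as another table's, every drop of the temp table would flush the other table's entry as well.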

Despite that, +1 for option #2. The risk of collision seems acceptable, 
and the consequence of a collision wouldn't be too bad in most 
applications anyway.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

