Re: Heap truncation without AccessExclusiveLock (9.4) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Heap truncation without AccessExclusiveLock (9.4)
Date
Msg-id CA+TgmoawriP6i4BzdVp-TBRqHmiuLA0cctORbFcQsshFQn72Fg@mail.gmail.com
In response to Re: Heap truncation without AccessExclusiveLock (9.4)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Heap truncation without AccessExclusiveLock (9.4)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, May 15, 2013 at 7:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Another problem is that sinval resets are bad for performance, and
>> anything we do that pushes more messages through sinval will increase
>> the frequency of resets.
>
> I've been thinking that we should increase the size of the sinval ring;
> now that we're out from under SysV shmem size limits, it wouldn't be
> especially painful to do that.  That's not terribly relevant to this
> issue though.  I agree that we don't want an sinval message per relation
> extension, no matter what the ring size is.

I've been thinking for a while that we need some other system for
managing other kinds of invalidations.  For example, suppose we want
to cache relation sizes in blocks.  So we allocate 4kB of shared
memory, interpreted as an array of 512 8-byte entries.  Whenever you
extend a relation, you hash the relfilenode and take the low-order 9
bits of the hash value as an index into the array.  You increment that
entry, either under a spinlock or perhaps using fetch-and-add where
available, so each entry acts as a change counter rather than holding
a size.
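
The write side described above might be sketched roughly as follows.  This is only an illustration under stated assumptions, not PostgreSQL code: the names change_counters, hash_relfilenode, slot_for and note_relation_extended are hypothetical, the hash is an arbitrary 32-bit mixer, a plain static array stands in for shared memory, and C11 atomics stand in for whatever fetch-and-add primitive would actually be used.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NUM_SLOTS 512               /* 512 entries of 8 bytes each */
#define SLOT_MASK (NUM_SLOTS - 1)   /* selects the low-order 9 bits */

static_assert(NUM_SLOTS * sizeof(uint64_t) == 4096,
              "the counter array should occupy 4kB");

/* Assumption: stand-in for the array that would live in shared memory. */
static _Atomic uint64_t change_counters[NUM_SLOTS];

/* Hypothetical hash: a generic 32-bit integer mixer, chosen only for
 * illustration. */
static uint32_t hash_relfilenode(uint32_t relfilenode)
{
    uint32_t h = relfilenode;

    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}

/* Map a relfilenode to one of the 512 slots. */
static uint32_t slot_for(uint32_t relfilenode)
{
    return hash_relfilenode(relfilenode) & SLOT_MASK;
}

/* To be called after a relation has been extended: bump the slot's
 * counter with fetch-and-add, so no spinlock is needed. */
static void note_relation_extended(uint32_t relfilenode)
{
    atomic_fetch_add(&change_counters[slot_for(relfilenode)], 1);
}
```

Because several relations can hash to the same slot, a bump for one relation also invalidates cached lengths for the others sharing that slot.  That costs an unnecessary recheck but never produces a stale answer.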

On the read side, every backend can cache the length of as many
relations as it wants.  But before relying on a cached value, it must
index into the shared array and see whether the entry has changed
since the length was cached.  On 64-bit systems, this requires no lock,
only a barrier, and some 32-bit systems have special instructions that
can be used for an 8-byte atomic read, and hence could avoid the lock
as well.  This would almost certainly be cheaper than doing an lseek
every time, although maybe not by enough to matter.  At least on Linux,
the syscall seems to be pretty cheap.
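
A minimal sketch of that read-side check, again under assumptions: CachedRelSize, cached_nblocks and the demo_probe stand-in for the lseek-based lookup are hypothetical names, the hash is an arbitrary mixer, and a C11 acquire load plays the role of the barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_SLOTS 512

static_assert((NUM_SLOTS & (NUM_SLOTS - 1)) == 0,
              "NUM_SLOTS must be a power of two");

/* Assumption: stand-in for the array that would live in shared memory. */
static _Atomic uint64_t change_counters[NUM_SLOTS];

/* Backend-local cache entry (hypothetical layout). */
typedef struct CachedRelSize
{
    uint32_t relfilenode;
    uint32_t nblocks;       /* cached length in blocks */
    uint64_t counter_seen;  /* slot counter when nblocks was cached */
    bool     valid;
} CachedRelSize;

/* Stand-in for the expensive lookup (lseek in the real system). */
typedef uint32_t (*size_probe_fn)(uint32_t relfilenode);

static uint32_t slot_for(uint32_t relfilenode)
{
    uint32_t h = relfilenode;   /* arbitrary mixer, illustration only */

    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    return h & (NUM_SLOTS - 1);
}

/* Return the relation length, re-probing only if the slot counter has
 * moved since the value was cached. */
static uint32_t cached_nblocks(CachedRelSize *entry, size_probe_fn probe)
{
    uint64_t now = atomic_load_explicit(
        &change_counters[slot_for(entry->relfilenode)],
        memory_order_acquire);

    if (!entry->valid || now != entry->counter_seen)
    {
        /* Record the counter read *before* probing, so an extension
         * racing with the probe forces a recheck next time. */
        entry->counter_seen = now;
        entry->nblocks = probe(entry->relfilenode);
        entry->valid = true;
    }
    return entry->nblocks;
}

/* Demo probe used only to exercise the sketch. */
static uint32_t demo_nblocks = 10;
static int      demo_probe_calls = 0;

static uint32_t demo_probe(uint32_t relfilenode)
{
    (void) relfilenode;
    demo_probe_calls++;
    return demo_nblocks;
}
```

The fast path is a single 8-byte load and a comparison.  The probe runs only on the first lookup or after some relation hashing to the same slot has been extended.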

Now, a problem with this is that we keep doing things that make it
hard for people to run very low memory instances of PostgreSQL.  So
whether or not we allocate space for this could potentially be
controlled by a GUC.  Or maybe the structure could be made somewhat
larger and shared among multiple caching needs.

I'm not sure whether this idea can be adapted to do what Heikki is
after.  But I think these kinds of techniques are worth thinking about
as we look for ways to further improve performance.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


