On Tue, May 1, 2012 at 11:42 AM, Andres Freund <andres@anarazel.de> wrote:
>> > There is the question whether this should be done in the background
>> > though, so the relation extension lock is never hit in anything
>> > time-critical...
>> Yeah, although I'm fuzzy on how and whether that can be made to work,
>> which is not to say that it can't.
> The biggest problem I see is knowing when to trigger the extension of which
> file without scanning files all the time.
>
> Using some limited size shm-queue of {reltblspc, relfilenode} of to-be-
> extended files + a latch is the first thing I can think of.
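To make the idea concrete, here is a minimal standalone sketch of such a bounded request queue; every name in it is invented for illustration (this is not PostgreSQL source), and the `wakeup` flag merely stands in for a `SetLatch()` on a background worker:

```c
/* Hypothetical sketch of a bounded shared queue of {tablespace,
 * relfilenode} pairs that backends push onto when a relation nears its
 * last allocated block. All identifiers are invented for illustration;
 * real code would live in shared memory and use SetLatch() to wake the
 * consumer. */
#include <stdbool.h>

#define EXTEND_QUEUE_SIZE 64

typedef struct
{
    unsigned int tblspc;        /* tablespace OID */
    unsigned int relfilenode;   /* on-disk file identifier */
} ExtendRequest;

typedef struct
{
    ExtendRequest items[EXTEND_QUEUE_SIZE];
    int  head;                  /* next slot to consume */
    int  tail;                  /* next slot to fill */
    bool wakeup;                /* stand-in for SetLatch() on a bgworker */
} ExtendQueue;

/* Push a request; returns false when the queue is full, in which case
 * the backend just extends the file itself in the foreground, so it is
 * safe for a bounded queue to drop requests. */
static bool
extend_queue_push(ExtendQueue *q, unsigned int tblspc,
                  unsigned int relfilenode)
{
    int next = (q->tail + 1) % EXTEND_QUEUE_SIZE;

    if (next == q->head)
        return false;           /* full: fall back to foreground extension */
    q->items[q->tail].tblspc = tblspc;
    q->items[q->tail].relfilenode = relfilenode;
    q->tail = next;
    q->wakeup = true;           /* real code: SetLatch() here */
    return true;
}

/* Pop the oldest request; returns false when the queue is empty. */
static bool
extend_queue_pop(ExtendQueue *q, ExtendRequest *out)
{
    if (q->head == q->tail)
        return false;           /* empty */
    *out = q->items[q->head];
    q->head = (q->head + 1) % EXTEND_QUEUE_SIZE;
    return true;
}
```

The "drop on full" behavior matters: because pre-extension is only an optimization, a full queue never blocks a backend, it just degrades to today's behavior.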
Perhaps. An in-memory cache of file sizes would also let us eliminate
a pile of lseek system calls, but the trick is that the
synchronization can't be anything simple like a spinlock - Linux did
it that way in the kernel in versions <= 3.2, and it's catastrophic
for short read-only transactions.
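One way to keep readers off any lock at all is to publish the cached size with plain atomic operations, so the common read-only path is a single atomic load and only the (rare) extension path does a store. A minimal sketch using C11 atomics, with all names invented for illustration:

```c
/* Hypothetical sketch (not PostgreSQL source): a per-relation cached
 * size read lock-free, so short read-only transactions never take a
 * spinlock or issue an lseek(). The extending backend publishes the
 * new length with a single release store. */
#include <stdatomic.h>

typedef struct
{
    unsigned int relfilenode;   /* which file this entry describes */
    _Atomic long nblocks;       /* cached size in blocks; -1 = unknown */
} RelSizeEntry;

/* Reader path: no lock, no syscall, one atomic load. A result of -1
 * means the caller must fall back to lseek() and populate the cache. */
static long
cached_nblocks(const RelSizeEntry *e)
{
    return atomic_load_explicit(&e->nblocks, memory_order_acquire);
}

/* Writer path: after extending the file, publish the new length. */
static void
update_nblocks(RelSizeEntry *e, long nblocks)
{
    atomic_store_explicit(&e->nblocks, nblocks, memory_order_release);
}
```

The acquire/release pairing ensures a reader that observes the new size also observes the writes that extended the file; whether that invariant actually holds for us depends on where in the extension path the store happens, which is exactly the hard part.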
I think the first thing we need here is a good test case, so we're
clear on what we're trying to solve. I was just hoping to make file
extension *faster* and what you and Simon are talking about is making
it scale better in the face of heavy parallelism; obviously it would
be nice to do both things, but they are different problems. Any old
bulk-loading test will benefit from a raw performance improvement, but
to test a scalability improvement we would need some kind of test case
involving parallel bulk loads, or some other kind of parallel activity
that causes rapid table growth. That's not something I've frequently
run into, but I'd be willing to put a bit of time into it if we can
nail down what we're talking about.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company