Re: Benchmark Data requested - Mailing list pgsql-performance

From Matthew
Subject Re: Benchmark Data requested
Date
Msg-id Pine.LNX.4.64.0802051612340.20402@aragorn.flymine.org
In response to Re: Benchmark Data requested  (Richard Huxton <dev@archonet.com>)
List pgsql-performance

On Tue, 5 Feb 2008, Richard Huxton wrote:
>> So what's wrong with "reserving" the space using the WAL, then everyone
>> else will know. After all, when you write the data to the WAL, you must
>> have an idea of where it is meant to end up. My suggestion is that you go
>> through all the motions of writing the data to the WAL, just without the
>> data bit.
>
> Well, now you're looking at page-level locking for the data blocks, or at
> least something very similar. Not sure what you'd do with indexes though -
> don't see a simple way of avoiding a large lock on a btree index.

Yeah, indexes would be a lot more difficult I guess, if writes to them
involve changing lots of stuff around. We do most of our loads without the
indexes present though.

> If you reserved the space in advance that could work. But you don't know how
> much to reserve until you've copied it in.

What does the WAL do? When do you allocate space in the file for written
rows? Is it when you write the WAL, or when you checkpoint it? If it's
when you write the WAL, then you can just use the same algorithm.

> You could of course have a set of co-operating processes all bulk-loading
> while maintaining a table-lock outside of those. It feels like things are
> getting complicated then though.

That does sound a bit evil.

You could have different backends, each running a single transaction where
they create one table and load the data for it. That wouldn't need any
change to the backend, but it would only work for dump restores, and would
require the client to be clever. I'm all for allowing this kind of
optimisation while writing normally to the database, and for not requiring
the client to think too hard.
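
To make the "clever client" idea concrete, here is a rough sketch in Python of
how a restore client might split the work into one single-transaction job per
table and hand the jobs to parallel workers. It is only an illustration under
assumptions: build_table_job and run_jobs are hypothetical names, and the
execute callable is a stand-in for "open a separate backend connection and run
these statements in order", which is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def build_table_job(table: str, ddl: str, data_file: str) -> List[str]:
    """Return the statements for one table, wrapped in a single transaction.

    Hypothetical helper: the table is created and loaded in the same
    transaction, so no other session can see it until COMMIT.
    """
    return [
        "BEGIN",
        ddl,                                   # e.g. CREATE TABLE ...
        f"COPY {table} FROM '{data_file}'",    # server-side file path (assumed)
        "COMMIT",
    ]


def run_jobs(
    jobs: Sequence[List[str]],
    execute: Callable[[List[str]], T],
    workers: int = 4,
) -> List[T]:
    """Run each job on its own worker and return results in job order.

    `execute` is supplied by the caller and is assumed to open its own
    connection (one backend per job) and run the statements sequentially.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(execute, jobs))
```

Because each job touches only its own freshly created table, the jobs do not
contend for locks with one another, which is why this needs no backend change.
It does nothing for concurrent loads into a single existing table.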

Matthew

--
All of this sounds mildly turgid and messy and confusing... but what the
heck. That's what programming's all about, really
                                        -- Computer Science Lecturer
