Re: extending relations more efficiently - Mailing list pgsql-hackers

From Robert Haas
Subject Re: extending relations more efficiently
Date
Msg-id CA+TgmobrJwyvOwztdK0mGkzw1wNv1Zqo3ykW7YpkuXHDNvtCwA@mail.gmail.com
In response to Re: extending relations more efficiently  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Tue, May 1, 2012 at 10:31 AM, Andres Freund <andres@anarazel.de> wrote:
>> efficient than our current method - I'm guessing that it actually
>> writes the updated metadata back to disk, where write() does not (this
>> makes one wonder how safe it is to count on write to have the behavior
>> we need here in the first place).
> Currently the write() doesn't need to be crashsafe because it will be repeated
> on crash-recovery and a checkpoint will fsync the file.

That's not what I'm worried about.  If the write() succeeds and a
subsequent close() on the file descriptor then reports ENOSPC, meaning
the write didn't really happen after all, I am concerned that we might
not handle that cleanly.
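
To make the concern concrete, here is a minimal sketch in C of what checking every step might look like.  It is not PostgreSQL code: append_zero_block() and BLOCK_SIZE are hypothetical names invented for illustration, and treating a short write as ENOSPC is a simplifying assumption.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 8192         /* assumed block size, for illustration only */

/*
 * Hypothetical helper: append one zero-filled block to the file at `path`.
 * Returns 0 on success, or the errno value of the first step that failed.
 * The point is that write(), fsync() and close() can each report an error,
 * so a successful write() alone does not prove the space was allocated.
 */
int
append_zero_block(const char *path)
{
    char        block[BLOCK_SIZE];
    int         fd;
    int         err = 0;
    ssize_t     n;

    assert(path != NULL);
    memset(block, 0, sizeof(block));

    fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0)
        return errno;

    n = write(fd, block, sizeof(block));
    if (n < 0)
        err = errno;
    else if ((size_t) n != sizeof(block))
        err = ENOSPC;           /* short write: simplification, see above */

    /* Ask the kernel to flush now, so a deferred error surfaces here. */
    if (err == 0 && fsync(fd) != 0)
        err = errno;

    /* close() can also fail; keep the earliest error if there was one. */
    if (close(fd) != 0 && err == 0)
        err = errno;

    return err;
}
```

A caller would treat any nonzero return as "the block may not exist on disk".  Whether close() can actually report a deferred ENOSPC depends on the filesystem, so this only illustrates the defensive pattern.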

> I don't really see why it would need to compare in the 8kb case. What reason
> would there be to further extend in that small increments?

In previous discussions, the concern has been that holding the
relation extension lock across a multi-block extension would cause
latency spikes for both the process doing the extensions and any other
concurrent processes that need the lock.  Obviously if it were
possible to extend by 64kB in the same time it takes to extend by 8kB
that would be awesome, but if it takes eight times longer then things
don't look so good.
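
One rough way to get a feel for the 8kB versus 64kB question is to time a single write() of each size.  The sketch below is an assumption-laden illustration, not a benchmark of PostgreSQL: timed_extend() and BLOCK_SIZE are hypothetical names, it measures only the write() call (typically into the page cache), and it does not model the relation extension lock at all.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

#define BLOCK_SIZE 8192         /* assumed block size, for illustration only */
#define MAX_BLOCKS 64           /* arbitrary cap on one extension */

/*
 * Hypothetical helper: extend the file behind `fd` by `nblocks` zero-filled
 * blocks with one write() call.  Returns the elapsed wall-clock time in
 * seconds, or -1.0 if seeking or writing failed.
 */
double
timed_extend(int fd, int nblocks)
{
    static char zeros[MAX_BLOCKS * BLOCK_SIZE]; /* static, so zero-filled */
    struct timespec start;
    struct timespec end;
    size_t      len;

    assert(nblocks > 0 && nblocks <= MAX_BLOCKS);
    len = (size_t) nblocks * BLOCK_SIZE;

    /* Position at end of file so the write extends it. */
    if (lseek(fd, 0, SEEK_END) < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    if (write(fd, zeros, len) != (ssize_t) len)
        return -1.0;
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double) (end.tv_sec - start.tv_sec)
        + (double) (end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Comparing timed_extend(fd, 1) against timed_extend(fd, 8) over many iterations would give a first hint as to whether the larger extension costs closer to 1x or 8x; the answer will vary with the filesystem and with how full the page cache is.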

> There is the question whether this should be done in the background though, so
> the relation extension lock is never hit in anything time-critical...

Yeah, although I'm fuzzy on how and whether that can be made to work,
which is not to say that it can't.

It might also be interesting to provide a mechanism to pre-extend a
relation to a certain number of blocks, though if we did that we'd
have to make sure that autovac got the memo not to truncate those
pages away again.
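
At the file level, pre-extending to a given number of blocks could be sketched with posix_fallocate(), as below.  This is only an illustration of the idea: preextend_to_blocks() and BLOCK_SIZE are hypothetical names, posix_fallocate() is not available on every platform, and nothing here addresses the truncation problem mentioned above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192         /* assumed block size, for illustration only */

/*
 * Hypothetical helper: make sure the file behind `fd` is at least `nblocks`
 * blocks long, allocating the missing space up front.  Returns 0 on success
 * (including when the file is already large enough), or an error number.
 */
int
preextend_to_blocks(int fd, int nblocks)
{
    off_t       current;
    off_t       target;

    assert(nblocks >= 0);

    current = lseek(fd, 0, SEEK_END);
    if (current < 0)
        return errno;

    target = (off_t) nblocks * BLOCK_SIZE;
    if (target <= current)
        return 0;               /* never shrink, only grow */

    /* posix_fallocate() returns an error number directly, not via errno. */
    return posix_fallocate(fd, current, target - current);
}
```

Unlike a plain write() of zeros, posix_fallocate() is specified to fail up front if the space cannot be reserved, which is part of why it is attractive for this purpose.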

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

