Re: Using multi-row technique with COPY - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Using multi-row technique with COPY
Date
Msg-id 1133299823.2906.437.camel@localhost.localdomain
Whole thread Raw
In response to Re: Using multi-row technique with COPY  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Using multi-row technique with COPY
List pgsql-hackers
On Tue, 2005-11-29 at 15:58 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > Tom Lane wrote:
> >> Log, yes, unless it's a temp table.  The point is we could avoid taking
> >> buffer content locks.  Come to think of it, we could implement that
> >> trivially in the heapam.c routines; it would then apply to any table
> >> update whether generated by COPY or otherwise.
> 
> > We should do that, yes, we've discussed that before.
> 
> Not that I recall ... 

It was a quick comment of yours only, not a long discussion. Its a good
idea and we should do it, but its not everything I want (other posts
pursue the other ideas)

> anyway, I had second thoughts about this while
> eating lunch.  A buffer for a new table can reasonably be assumed not to
> be accessed by any other backend, but we can *not* make that assumption
> for the bgwriter.  This means that the bgwriter might come along and
> dump the buffer to disk while we are in the midst of scribbling on it.
> Since we'll mark the thing dirty after we finish scribbling, no harm
> is done --- unless there is a crash after we commit and before the next
> write of the buffer occurs.  In that case we have a corrupt copy of the
> buffer on disk, which is going to pose a problem for recovery.  (As an
> example, the corrupt copy might have an invalid LSN that is larger than
> it really should be, which will cause us to ignore the WAL record that
> might have otherwise let us fix the problem.)
> 
> As far as I can see at the moment, such a problem will be repaired
> during recovery *as long as you have the full-page-writes option on*.
> Otherwise not.  So we'd probably have to restrict the optimization to
> happen only if that's on, as well as only for new tables.

Not sure I understand that. If there's a crash then the transaction
failed so any data on the partially written block is invisible. Why else
would the block become corrupt? 

Whether full page writes is on or not, we still fill the block. After
which we never log the full page because the block doesn't change again.

Should we checkpoint half way through filling the block, then its still
written to disk whether we do full page writes or not.

Maybe its just late here...

> >> Exactly my point; we don't have to change any syntax, so pg_dump
> >> doesn't care.
> 
> > Does pg_dump wrap the CREATE TABLE and COPY into the same transaction?
> 
> No, but you can manually put BEGIN and END around the whole script.
> Some people do that anyway.  (It wouldn't be a bad idea to add an option
> to pg_restore to do it, since there's no very convenient way to add
> begin/end manually when piping pg_restore right into a database.)

Can do.

Best Regards, Simon Riggs



pgsql-hackers by date:

Previous
From: Gavin Sherry
Date:
Subject: Re: ice-broker scan thread
Next
From: Gavin Sherry
Date:
Subject: Re: ice-broker scan thread