Re: TRUNCATE SERIALIZABLE and frozen COPY - Mailing list pgsql-hackers

From Robert Haas
Subject Re: TRUNCATE SERIALIZABLE and frozen COPY
Date
Msg-id CA+TgmoYh_KXErp9eOejMV6EJJaczeZZcSj3kRtq=yg1SjiMidg@mail.gmail.com
In response to Re: TRUNCATE SERIALIZABLE and frozen COPY  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: TRUNCATE SERIALIZABLE and frozen COPY  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Fri, Nov 9, 2012 at 11:27 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> My goal is to allow COPY to load frozen tuples without causing MVCC violations.
>
> If that's the goal, I question why you're insisting on touching
> TRUNCATE's behavior.  We already have the principle that "TRUNCATE is
> like DELETE except not concurrent-safe".  Why not just invent a
> non-concurrent-safe option to COPY that loads prefrozen tuples into a
> new heap, and call it good?  There will be visibility oddness from that
> definition, sure, but AFAICS there will be visibility oddness from what
> you're talking about too.  You'll just have expended a very great deal
> of effort to make the weirdness a bit different.  Even if the TRUNCATE
> part of it were perfectly clean, the "load prefrozen tuples" part won't
> be --- so I'm not seeing the value of changing TRUNCATE.

I don't object to the idea of giving COPY a way to load prefrozen
tuples, but I think you might be missing the point here otherwise.
Right now, if you CREATE or TRUNCATE a table, copy a bunch of data
into it, and then commit, another transaction that took a snapshot
before your commit can subsequently look at that table and it will NOT
see your newly-loaded data.  What it will see instead is an empty
table.  This is, of course, wrong: it ought to fail with a
serialization error.  It is very possible that the table has never
been empty at the conclusion of a completed transaction: it might have
contained data before the TRUNCATE, and it might again contain data by
the time the truncating transaction commits.  Yet, we see it as empty,
which is not MVCC-compliant.
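
To make the scenario concrete, here is a toy Python model of it. This is
only a sketch, not PostgreSQL code: Row, Table, scan and demo are
hypothetical names, a snapshot is reduced to the set of xids it regards as
committed, and the model assumes that every reader is directed to the new
heap after a TRUNCATE regardless of its snapshot.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    xmin: int   # xid of the transaction that inserted the row
    value: str


@dataclass
class Table:
    heap: list = field(default_factory=list)

    def truncate(self):
        # Swap in a brand-new empty heap. In this model every reader,
        # whatever its snapshot, sees the new heap from now on.
        self.heap = []

    def copy_in(self, xid, values):
        # Load rows stamped with the loading transaction's xid.
        self.heap.extend(Row(xid, v) for v in values)


def scan(table, snapshot_committed):
    """Return the values whose inserting xid the snapshot sees as committed."""
    return [r.value for r in table.heap if r.xmin in snapshot_committed]


def demo():
    t = Table()
    t.copy_in(100, ["old row"])         # xid 100 loads data and commits
    reader_snapshot = frozenset({100})  # the reader takes its snapshot here
    t.truncate()                        # xid 101: TRUNCATE ...
    t.copy_in(101, ["new row"])         # ... then COPY, then commit
    return scan(t, reader_snapshot)     # the reader looks at the table
```

In this model demo() returns an empty list: the old rows went away with the
old heap, and the new rows carry an xid the reader's snapshot does not
consider committed. The MVCC-correct answer would have been the old
contents, or an error.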

If we were to make COPY pre-freeze the data when the table was created
or truncated in the same transaction, it would alter the behavior in
this situation, and from an application perspective, only this
situation.  Now, instead of seeing the table as empty, you'd see the
new contents.  This is also not MVCC-compliant, and I guess the
concern when we have talked about this topic before is that changing
from wrong behavior to another, not-backward-compatible wrong behavior
might not be the friendliest thing to do.  We could decide we don't
care and just break it.  Or we could try to make it throw a
serialization error, as Simon is proposing here, which seems like the
tidiest solution.  Or we could keep the old heap around until there
are no more snapshots that can need it, which is a bit scary since
we'd be eating double disk-space in the meantime, but it would
certainly be useful to some users, I think.
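
As a rough illustration of the first two options, the following toy Python
sketch contrasts "pre-freeze and don't care" with "raise a serialization
error". It is not how PostgreSQL implements anything: scan,
SerializationError and the truncated_by bookkeeping are hypothetical, and a
snapshot is again reduced to the set of xids it regards as committed. The
value 2 for FROZEN_XID mirrors PostgreSQL's FrozenTransactionId, but here it
is only a marker meaning "visible to everyone".

```python
FROZEN_XID = 2  # marker for "visible to every snapshot" in this model


class SerializationError(Exception):
    """Raised when a snapshot predates the truncation of the table it reads."""


def scan(heap, snapshot_committed, truncated_by=None):
    """Scan a heap given as a list of (xmin, value) pairs.

    truncated_by is the xid that last truncated the table, or None if this
    model should not check for truncation at all.
    """
    if truncated_by is not None and truncated_by not in snapshot_committed:
        # The snapshot cannot see the truncating transaction, so any answer
        # computed from the new heap would be inconsistent with it.
        raise SerializationError("table truncated after snapshot was taken")
    return [v for xmin, v in heap
            if xmin == FROZEN_XID or xmin in snapshot_committed]


old_snapshot = frozenset({100})          # taken before xid 101 committed
frozen_heap = [(FROZEN_XID, "new row")]  # loaded pre-frozen by xid 101

# Pre-freeze only: the old snapshot sees the new contents.
seen = scan(frozen_heap, old_snapshot)

# Pre-freeze plus a truncation check: the old snapshot gets an error instead.
try:
    scan(frozen_heap, old_snapshot, truncated_by=101)
    outcome = "returned rows"
except SerializationError:
    outcome = "serialization error"
```

A snapshot that does include xid 101 passes the check and simply sees the
new rows, so only readers whose snapshots predate the truncation are
affected.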

Just having an option to preload frozen tuples dodges all of these
issues by throwing our hands up in the air, but it does have the
advantage of being more general.  Even if we do that I'm not sure it
would be a bad thing to try to solve this issue in a somewhat more
principled way, but it would surely reduce the urgency.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
