Thread: TRUNCATE SERIALIZABLE and frozen COPY

TRUNCATE SERIALIZABLE and frozen COPY

From
Simon Riggs
Date:
For 9.2 we discussed having COPY setting tuples as frozen. Various
details apply.
Earlier threads: "RFC: Making TRUNCATE more "MVCC-safe" "COPY with hints, rebirth"

I was unhappy with changing the behaviour of TRUNCATE, and still am.
So the proposal here is to have a specific modifier on TRUNCATE
command that makes it MVCC safe by throwing a serialization error.
That new behaviour should be requestable by adding the SERIALIZABLE
keyword.
i.e. TRUNCATE foo SERIALIZABLE;
This then allows a new style of TRUNCATE, yet without modiying
behaviour of earlier programs (ugh!).

(Once we have this, somebody that cares more than me may wish to
discuss deprecation of existing TRUNCATE behaviour in favour of this
new behaviour. I am not suggesting that here, nor do I even see a
reason for that at present.)

As soon as we have TRUNCATE SERIALIZABLE we can then enable normal
COPY to set tuples as frozen when
1) no earlier snapshots exist in the current transaction - since they
might see frozen tuples loaded in a later command and thus cause MVCC
violation
2) table has been truncated in this subtransaction by a TRUNCATE SERIALIZABLE
That then means the optimization would be available for normal data
loads/reloads.

This looks like a fairly neat way to allow MVCC-aware TRUNCATE for
those that want it, without affecting existing code and yet speeding
up large loads.

(Note that I am not suggesting any change to existing CREATE TABLE
behaviour. If you want this optimization, just add in TRUNCATE
SERIALIZABLE).

Thoughts?

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Robert Haas
Date:
On Wed, Nov 7, 2012 at 10:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> For 9.2 we discussed having COPY setting tuples as frozen. Various
> details apply.
> Earlier threads:
>   "RFC: Making TRUNCATE more "MVCC-safe"
>   "COPY with hints, rebirth"
>
> I was unhappy with changing the behaviour of TRUNCATE, and still am.
> So the proposal here is to have a specific modifier on TRUNCATE
> command that makes it MVCC safe by throwing a serialization error.

I don't think I understand the proposal.  Under what circumstances
would it throw a serialization error?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Simon Riggs
Date:
On 8 November 2012 17:07, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Nov 7, 2012 at 10:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> For 9.2 we discussed having COPY setting tuples as frozen. Various
>> details apply.
>> Earlier threads:
>>   "RFC: Making TRUNCATE more "MVCC-safe"
>>   "COPY with hints, rebirth"
>>
>> I was unhappy with changing the behaviour of TRUNCATE, and still am.
>> So the proposal here is to have a specific modifier on TRUNCATE
>> command that makes it MVCC safe by throwing a serialization error.
>
> I don't think I understand the proposal.  Under what circumstances
> would it throw a serialization error?

If you ask for TRUNCATE SERIALIZABLE then if someone asks for data in
the table and has a snapshot that can see earlier data then it will
throw a serializable error. So its a new kind of TRUNCATE that is MVCC
safe.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Hannu Krosing
Date:
On 11/08/2012 08:51 PM, Simon Riggs wrote:
> On 8 November 2012 17:07, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Nov 7, 2012 at 10:34 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> For 9.2 we discussed having COPY setting tuples as frozen. Various
>>> details apply.
>>> Earlier threads:
>>>    "RFC: Making TRUNCATE more "MVCC-safe"
>>>    "COPY with hints, rebirth"
>>>
>>> I was unhappy with changing the behaviour of TRUNCATE, and still am.
>>> So the proposal here is to have a specific modifier on TRUNCATE
>>> command that makes it MVCC safe by throwing a serialization error.
>> I don't think I understand the proposal.  Under what circumstances
>> would it throw a serialization error?
> If you ask for TRUNCATE SERIALIZABLE then if someone asks for data in
> the table and has a snapshot that can see earlier data then it will
> throw a serializable error. So its a new kind of TRUNCATE that is MVCC
> safe.
Can't we make it so that the reader with earlier snapshot sees the data 
from the pre-truncation file ?

and we unlink the base file(s) only once nobody has a snapshot the can 
see it ?

or are there some subtler problems (I was under impression that we 
already did this as described above) ?

----------------
Hannu



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Simon Riggs
Date:
On 8 November 2012 23:20, Hannu Krosing <hannu@2ndquadrant.com> wrote:
> On 11/08/2012 08:51 PM, Simon Riggs wrote:
>>
>> On 8 November 2012 17:07, Robert Haas <robertmhaas@gmail.com> wrote:
>>>
>>> On Wed, Nov 7, 2012 at 10:34 AM, Simon Riggs <simon@2ndquadrant.com>
>>> wrote:
>>>>
>>>> For 9.2 we discussed having COPY setting tuples as frozen. Various
>>>> details apply.
>>>> Earlier threads:
>>>>    "RFC: Making TRUNCATE more "MVCC-safe"
>>>>    "COPY with hints, rebirth"
>>>>
>>>> I was unhappy with changing the behaviour of TRUNCATE, and still am.
>>>> So the proposal here is to have a specific modifier on TRUNCATE
>>>> command that makes it MVCC safe by throwing a serialization error.
>>>
>>> I don't think I understand the proposal.  Under what circumstances
>>> would it throw a serialization error?
>>
>> If you ask for TRUNCATE SERIALIZABLE then if someone asks for data in
>> the table and has a snapshot that can see earlier data then it will
>> throw a serializable error. So its a new kind of TRUNCATE that is MVCC
>> safe.
>
> Can't we make it so that the reader with earlier snapshot sees the data from
> the pre-truncation file ?

We could... but that would require keeping a history of relfilenodes
for an object to allow for more than one TRUNCATE event. Tracking all
of that would be hard and I don't personally think its worth that
effort.

> and we unlink the base file(s) only once nobody has a snapshot the can see
> it ?

DELETE does that if that's the semantics you want.

> or are there some subtler problems (I was under impression that we already
> did this as described above) ?

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Hannu Krosing
Date:
On 11/09/2012 09:34 AM, Simon Riggs wrote:
> On 8 November 2012 23:20, Hannu Krosing <hannu@2ndquadrant.com> wrote:
>> On 11/08/2012 08:51 PM, Simon Riggs wrote:
>>> On 8 November 2012 17:07, Robert Haas <robertmhaas@gmail.com> wrote:
>>>> On Wed, Nov 7, 2012 at 10:34 AM, Simon Riggs <simon@2ndquadrant.com>
>>>> wrote:
>>>>> For 9.2 we discussed having COPY setting tuples as frozen. Various
>>>>> details apply.
>>>>> Earlier threads:
>>>>>     "RFC: Making TRUNCATE more "MVCC-safe"
>>>>>     "COPY with hints, rebirth"
>>>>>
>>>>> I was unhappy with changing the behaviour of TRUNCATE, and still am.
>>>>> So the proposal here is to have a specific modifier on TRUNCATE
>>>>> command that makes it MVCC safe by throwing a serialization error.
>>>> I don't think I understand the proposal.  Under what circumstances
>>>> would it throw a serialization error?
>>> If you ask for TRUNCATE SERIALIZABLE then if someone asks for data in
>>> the table and has a snapshot that can see earlier data then it will
>>> throw a serializable error. So its a new kind of TRUNCATE that is MVCC
>>> safe.
>> Can't we make it so that the reader with earlier snapshot sees the data from
>> the pre-truncation file ?
> We could... but that would require keeping a history of relfilenodes
> for an object to allow for more than one TRUNCATE event.
MVCC does keep history of old relfilenodes.

I thought we were able to read old MVCC versions in pg_class for
this if the snapshot required data matching older pg_class record.
> Tracking all
> of that would be hard and I don't personally think its worth that
> effort.
>
>> and we unlink the base file(s) only once nobody has a snapshot the can see
>> it ?
> DELETE does that if that's the semantics you want.
>
>> or are there some subtler problems (I was under impression that we already
>> did this as described above) ?




Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Marti Raudsepp
Date:
On Wed, Nov 7, 2012 at 5:34 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> I was unhappy with changing the behaviour of TRUNCATE, and still am.
> So the proposal here is to have a specific modifier on TRUNCATE
> command that makes it MVCC safe by throwing a serialization error.
> That new behaviour should be requestable by adding the SERIALIZABLE
> keyword.
> i.e. TRUNCATE foo SERIALIZABLE;
> This then allows a new style of TRUNCATE, yet without modiying
> behaviour of earlier programs (ugh!).

Personally I think the behavior should be dictated by the *reader*.
The one doing the truncation may not know about the consistency
requirements of particular readers. Especially when you do the
truncate via pg_restore or some other generic tool. And indeed
different readers may have different consistency requirements.

So I'd prefer if it were a GUC variable; readers that accept relaxed
consistency can set truncate_conflict=off

Granted, making this part of the TRUNCATE statement does give you
table-level granularity. But if a reader can already handle
serialization conflicts, it doesn't really matter which table they
came from.

But making TRUNCATE behave in a MVCC-safe way, like proposed by Hannu,
would be much better.

Regards,
Marti



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Simon Riggs
Date:
On 9 November 2012 10:28, Marti Raudsepp <marti@juffo.org> wrote:
> On Wed, Nov 7, 2012 at 5:34 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> I was unhappy with changing the behaviour of TRUNCATE, and still am.
>> So the proposal here is to have a specific modifier on TRUNCATE
>> command that makes it MVCC safe by throwing a serialization error.
>> That new behaviour should be requestable by adding the SERIALIZABLE
>> keyword.
>> i.e. TRUNCATE foo SERIALIZABLE;
>> This then allows a new style of TRUNCATE, yet without modiying
>> behaviour of earlier programs (ugh!).
>
> Personally I think the behavior should be dictated by the *reader*.
> The one doing the truncation may not know about the consistency
> requirements of particular readers. Especially when you do the
> truncate via pg_restore or some other generic tool. And indeed
> different readers may have different consistency requirements.
>
> So I'd prefer if it were a GUC variable; readers that accept relaxed
> consistency can set truncate_conflict=off

I proposed something similar earlier, but Robert said he didn't like
that. The way you've described it here makes more sense, whereas my
parameter name made it seem more arbitrary. So I think that looks like
the way to go.

I'm trying to use this as a way to optimize COPY, so we'd need to make
a Seq Scan return zero rows if the truncatexid is seen as running by
the snapshot, which is the current behaviour. That seems easy enough.

> Granted, making this part of the TRUNCATE statement does give you
> table-level granularity. But if a reader can already handle
> serialization conflicts, it doesn't really matter which table they
> came from.

> But making TRUNCATE behave in a MVCC-safe way, like proposed by Hannu,
> would be much better.

Maybe, but one of the reasons for having a separate TRUNCATE command
rather than DELETE is the immediately removal of space. Changing
TRUNCATE so it suddenly holds on to space for longer will force us to
create a new command that acts like the old TRUNCATE.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Robert Haas
Date:
On Fri, Nov 9, 2012 at 8:22 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Personally I think the behavior should be dictated by the *reader*.
>> The one doing the truncation may not know about the consistency
>> requirements of particular readers. Especially when you do the
>> truncate via pg_restore or some other generic tool. And indeed
>> different readers may have different consistency requirements.
>>
>> So I'd prefer if it were a GUC variable; readers that accept relaxed
>> consistency can set truncate_conflict=off
>
> I proposed something similar earlier, but Robert said he didn't like
> that. The way you've described it here makes more sense, whereas my
> parameter name made it seem more arbitrary. So I think that looks like
> the way to go.

Hmm, I don't remember saying I didn't like that.  Maybe I disliked
something about a particular proposed implementation?  Actually, I
don't really see the need for this to be customizable at all.  I have
to believe that there is vanishingly little application code that
would care about this change in semantics, so why not just change the
behavior and call it good?

I think the question that hasn't really been adequately answered is:
where and how are we going to track conflicts?  Your previous patch
involved storing an XID in pg_class, but I think we both found that a
bit grotty - it'd probably need special handling for wraparound, and I
think we came up with some related cases that couldn't be handled in
the same way without adding a bunch more XIDs to various places.  I
don't really like the idea of having XIDs floating around in the
system catalogs - it seems like a recipe for bugs, not to mention that
storing ephemeral data in a persistent table seems like a mismatch.

What I've been wondering since this last came up is whether we could
use some variant of the SIREAD locks Kevin introduced for SSI to
handle this case - essentially have the transaction doing the TRUNCATE
make an entry in the lock table that will force a serialization
failure for any backend which accesses the table with a snapshot that
can't see the truncating transaction's XID.  The lock table entry
would need some kind of deferred clean-up, so it doesn't go away until
the locker's XID precedes RecentGlobalXmin.  Of course, an extra lock
table probe for every table access will be unacceptable from a
concurrency perspective, but we could probably optimize most of them
away by only checking the lock table if the pg_class row's own xmin is
new enough that the other backend's MVCC snapshot can't see it.  A
recent update to pg_class doesn't imply the existing of a lock, but
the absence of any recent update to pg_class does imply that no lock
can exist.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Simon Riggs
Date:
On 9 November 2012 14:01, Robert Haas <robertmhaas@gmail.com> wrote:

> I think the question that hasn't really been adequately answered is:
> where and how are we going to track conflicts?  Your previous patch
> involved storing an XID in pg_class, but I think we both found that a
> bit grotty - it'd probably need special handling for wraparound, and I
> think we came up with some related cases that couldn't be handled in
> the same way without adding a bunch more XIDs to various places.  I
> don't really like the idea of having XIDs floating around in the
> system catalogs - it seems like a recipe for bugs, not to mention that
> storing ephemeral data in a persistent table seems like a mismatch.

Yes, the xid only needs to be transient, not in pg_class.

> What I've been wondering since this last came up is whether we could
> use some variant of the SIREAD locks Kevin introduced for SSI to
> handle this case - essentially have the transaction doing the TRUNCATE
> make an entry in the lock table that will force a serialization
> failure for any backend which accesses the table with a snapshot that
> can't see the truncating transaction's XID.  The lock table entry
> would need some kind of deferred clean-up, so it doesn't go away until
> the locker's XID precedes RecentGlobalXmin.  Of course, an extra lock
> table probe for every table access will be unacceptable from a
> concurrency perspective, but we could probably optimize most of them
> away by only checking the lock table if the pg_class row's own xmin is
> new enough that the other backend's MVCC snapshot can't see it.  A
> recent update to pg_class doesn't imply the existing of a lock, but
> the absence of any recent update to pg_class does imply that no lock
> can exist.

I think the xid should still live in relcache, per the patch, but
should live in a transient place (and not pg_class).

We need a fast lookup structure that is expandable to accommodate
arbitrary numbers of truncates. Shared hash table, with some form of
overflow mechanism.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
"Kevin Grittner"
Date:
Robert Haas wrote:

> What I've been wondering since this last came up is whether we
> could use some variant of the SIREAD locks Kevin introduced for SSI
> to handle this case - essentially have the transaction doing the
> TRUNCATE make an entry in the lock table that will force a
> serialization failure for any backend which accesses the table with
> a snapshot that can't see the truncating transaction's XID.

It seems to me that the goal would be to make this semantically
idential to the behavior users would see if an unqualified DELETE
were run against the table rather than a TRUNCATE. To wit:

(1) Any attempt to read from the truncated table would not block. The
pg_class version included in the transaction's snapshot would
determine which heap and indexes were accessed. If the reading
transaction were SERIALIZABLE, it would generate a read-write
conflict out to the truncating transaction.

(2) Any attempt to write to the truncated table would block until the
end of the transaction which is doing the truncation. If the
truncating transaction rolls back, it proceeds normally against the
old data. Otherwise: If the transaction is READ COMMITTED, follow the
pg_class update links. At more strict isolation levels, generate a
write conflict error.

I'm not sure where any new use of the predicate locking system would
come into play in that, other than properly handling read-write
conflicts when both transactions were SERIALIZABLE.

This seems like a subset of the issues which one might want to
address by making DDL statement behave in a more strictly MVCC
fashion. Does it make sense to pick those off one at a time, or
should something like this be done only in the context of an overall
plan to deal with all of it?

-Kevin



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Marti Raudsepp
Date:
On Fri, Nov 9, 2012 at 4:24 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> We need a fast lookup structure that is expandable to accommodate
> arbitrary numbers of truncates. Shared hash table, with some form of
> overflow mechanism.

Surely you only need to remember the last completed truncate for each
relation? The latest one also invalidates any snapshots before earlier
truncates.

Regards,
Marti



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Simon Riggs
Date:
On 9 November 2012 14:55, Marti Raudsepp <marti@juffo.org> wrote:
> On Fri, Nov 9, 2012 at 4:24 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> We need a fast lookup structure that is expandable to accommodate
>> arbitrary numbers of truncates. Shared hash table, with some form of
>> overflow mechanism.
>
> Surely you only need to remember the last completed truncate for each
> relation?

Yes

> The latest one also invalidates any snapshots before earlier
> truncates.

1 per table, arbirary number of tables

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Simon Riggs
Date:
On 9 November 2012 14:22, Kevin Grittner <kgrittn@mail.com> wrote:
> Robert Haas wrote:
>
>> What I've been wondering since this last came up is whether we
>> could use some variant of the SIREAD locks Kevin introduced for SSI
>> to handle this case - essentially have the transaction doing the
>> TRUNCATE make an entry in the lock table that will force a
>> serialization failure for any backend which accesses the table with
>> a snapshot that can't see the truncating transaction's XID.
>
> It seems to me that the goal would be to make this semantically
> idential to the behavior users would see if an unqualified DELETE
> were run against the table rather than a TRUNCATE.

Unqualified DELETE already runs that way. TRUNCATE is a different
command for a reason. Making TRUNCATE like something we already have
seems not very useful to me, not least because it breaks existing
applications.

> This seems like a subset of the issues which one might want to
> address by making DDL statement behave in a more strictly MVCC
> fashion. Does it make sense to pick those off one at a time, or
> should something like this be done only in the context of an overall
> plan to deal with all of it?

TRUNCATE is not DDL, plus I have no interest in this other than
speeding up COPY.

Scope creep just kills features.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Merlin Moncure
Date:
On Fri, Nov 9, 2012 at 8:22 AM, Kevin Grittner <kgrittn@mail.com> wrote:
> Robert Haas wrote:
>
>> What I've been wondering since this last came up is whether we
>> could use some variant of the SIREAD locks Kevin introduced for SSI
>> to handle this case - essentially have the transaction doing the
>> TRUNCATE make an entry in the lock table that will force a
>> serialization failure for any backend which accesses the table with
>> a snapshot that can't see the truncating transaction's XID.
>
> It seems to me that the goal would be to make this semantically
> idential to the behavior users would see if an unqualified DELETE
> were run against the table rather than a TRUNCATE. To wit:

but, triggers would not fire, right?

merlin



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
"Kevin Grittner"
Date:
Merlin Moncure wrote:
> Kevin Grittner <kgrittn@mail.com> wrote:
>> Robert Haas wrote:

>> It seems to me that the goal would be to make this semantically
>> idential to the behavior users would see if an unqualified DELETE
>> were run against the table rather than a TRUNCATE.
> 
> but, triggers would not fire, right?

Right. Perhaps "identical" was too strong a word. I was referring to
the aspect under consideration here -- making it "serializable" in
line with other MVCC operations.

If we're not talking about making conflicts with other transactions
behave just the same as an unqualified DELETE from a user
perspective, I'm not sure what the goal is, exactly. Obviously we
would be keeping the guts of the implementation the same (swapping in
a new, empty heap).

-Kevin



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Simon Riggs
Date:
On 9 November 2012 15:34, Kevin Grittner <kgrittn@mail.com> wrote:

> If we're not talking about making conflicts with other transactions
> behave just the same as an unqualified DELETE from a user
> perspective, I'm not sure what the goal is, exactly.

Reasonable question.

My goal is to allow COPY to load frozen tuples without causing MVCC violations.

Altering TRUNCATE so it behaves perfectly from an MVCC/Serializable
perspective is a much bigger, and completely different goal, as well
as something I don't see as desirable anyway for at least 2 good
reasons, as explained. IMHO if people want MVCC/Serializable
semantics, use DELETE, possibly spending time to make unqualified
DELETE do some fancy TRUNCATE-like tricks with relfilenodes.

Forcing a tightly scoped proposal into a much wider one will just kill
this and leave it blocked.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Simon Riggs
Date:
On 9 November 2012 15:46, Simon Riggs <simon@2ndquadrant.com> wrote:

> Altering TRUNCATE so it behaves perfectly from an MVCC/Serializable
> perspective is a much bigger, and completely different goal, as well
> as something I don't see as desirable anyway for at least 2 good
> reasons, as explained. IMHO if people want MVCC/Serializable
> semantics, use DELETE, possibly spending time to make unqualified
> DELETE do some fancy TRUNCATE-like tricks with relfilenodes.

We spent a lot of time in 9.2 making TRUNCATE/reload of a table "just
work", rather than implementing a REPLACE command.

ISTM strange to throw away all that effort, changing behaviour of
TRUNCATE and thus forcing the need for a REPLACE command after all.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
"Kevin Grittner"
Date:
Simon Riggs wrote:

>> This seems like a subset of the issues which one might want to
>> address by making DDL statement behave in a more strictly MVCC
>> fashion. Does it make sense to pick those off one at a time, or
>> should something like this be done only in the context of an
>> overall plan to deal with all of it?
> 
> TRUNCATE is not DDL

You're right, I should have said utility commands.

> I have no interest in this other than speeding up COPY.

I would love to have that!

> Scope creep just kills features.

Well, I wasn't saying it should all be *done* at the same time, but
this is not the only utility command which could benefit from such an
effort, and if each one is done with no consideration of what it
takes for them all to be done, we could wind up with something that
doesn't hang together very coherently. Per perhaps this one could
serve as a "pilot", to identify issues and help develop such a plan.

-Kevin



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
"Kevin Grittner"
Date:
Simon Riggs wrote:

> My goal is to allow COPY to load frozen tuples without causing MVCC
> violations.

OK. That wasn't initially clear to me.

> Forcing a tightly scoped proposal into a much wider one will just
> kill this and leave it blocked.

The goal is important enough and narrow enough to merit the approach
you're talking about, IMV, at least. Sorry I initially misunderstood
what you were going for.

-Kevin



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> On 9 November 2012 15:34, Kevin Grittner <kgrittn@mail.com> wrote:
>> If we're not talking about making conflicts with other transactions
>> behave just the same as an unqualified DELETE from a user
>> perspective, I'm not sure what the goal is, exactly.

> Reasonable question.

> My goal is to allow COPY to load frozen tuples without causing MVCC violations.

If that's the goal, I question why you're insisting on touching
TRUNCATE's behavior.  We already have the principle that "TRUNCATE is
like DELETE except not concurrent-safe".  Why not just invent a
non-concurrent-safe option to COPY that loads prefrozen tuples into a
new heap, and call it good?  There will be visibility oddness from that
definition, sure, but AFAICS there will be visibility oddness from what
you're talking about too.  You'll just have expended a very great deal
of effort to make the weirdness a bit different.  Even if the TRUNCATE
part of it were perfectly clean, the "load prefrozen tuples" part won't
be --- so I'm not seeing the value of changing TRUNCATE.
        regards, tom lane



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Robert Haas
Date:
On Fri, Nov 9, 2012 at 9:22 AM, Kevin Grittner <kgrittn@mail.com> wrote:
> (1) Any attempt to read from the truncated table would not block. The
> pg_class version included in the transaction's snapshot would
> determine which heap and indexes were accessed.

Well, the thing is, you can't actually do this.  When the transaction
commits, we truncate the main forks of the old heap and index and
remove all of the supplemental forks.  The main forks are finally
removed for good at the next checkpoint cycle.  To make this work,
we'd have to keep around the old heap and index until there were no
longer any MVCC snapshots that could see them.  That might be useful
as an option, but it would need a bunch of additional mechanism, and
it doesn't seem desirable as a default behavior because it could defer
disk space reclamation indefinitely.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Robert Haas
Date:
On Fri, Nov 9, 2012 at 11:27 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> My goal is to allow COPY to load frozen tuples without causing MVCC violations.
>
> If that's the goal, I question why you're insisting on touching
> TRUNCATE's behavior.  We already have the principle that "TRUNCATE is
> like DELETE except not concurrent-safe".  Why not just invent a
> non-concurrent-safe option to COPY that loads prefrozen tuples into a
> new heap, and call it good?  There will be visibility oddness from that
> definition, sure, but AFAICS there will be visibility oddness from what
> you're talking about too.  You'll just have expended a very great deal
> of effort to make the weirdness a bit different.  Even if the TRUNCATE
> part of it were perfectly clean, the "load prefrozen tuples" part won't
> be --- so I'm not seeing the value of changing TRUNCATE.

I don't object to the idea of giving COPY a way to load prefrozen
tuples, but I think you might be missing the point here otherwise.
Right now, if you CREATE or TRUNCATE a table, copy a bunch of data
into it, and then commit, another transaction that took a snapshot
before your commit can subsequently look at that table and it will NOT
see your newly-loaded data.  What it will see instead is an empty
table.  This is, of course, wrong: it ought to fail with a
serialization error.  It is very possible that the table has never
been empty at the conclusion of a completed transaction: it might have
contained data before the TRUNCATE, and it might again contain data by
the time the truncating transaction commits.  Yet, we see it as empty,
which is not MVCC-compliant.

If we were to make COPY pre-freeze the data when the table was created
or truncated in the same transaction, it would alter the behavior in
this situation, and from an application perspective, only this
situation.  Now, instead of seeing the table as empty, you'd see the
new contents.  This is also not MVCC-compliant, and I guess the
concern when we have talked about this topic before is that changing
from wrong behavior to another, not-backward-compatible wrong behavior
might not be the friendliest thing to do.  We could decide we don't
care and just break it.  Or we could try to make it through a
serialization error, as Simon is proposing here, which seems like the
tidiest solution.  Or we could keep the old heap around until there
are no more snapshots that can need it, which is a bit scary since
we'd be eating double disk-space in the meantime, but it would
certainly be useful to some users, I think.

Just having an option to preload frozen tuples dodges all of these
issues by throwing our hands up in the air, but it does have the
advantage of being more general.  Even if we do that I'm not sure it
would be a bad thing to try to solve this issue in a somewhat more
principled way, but it would surely reduce the urgency.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Just having an option to preload frozen tuples dodges all of these
> issues by throwing our hands up in the air, but it does have the
> advantage of being more general.  Even if we do that I'm not sure it
> would be a bad thing to try to solve this issue in a somewhat more
> principled way, but it would surely reduce the urgency.

Yeah.  ISTM the whole point of TRUNCATE is "I don't care about
serializability for this operation, give me efficiency instead".
So I see nothing wrong with a (non-default) option for COPY that
similarly trades away some semantic guarantees for efficiency's sake.
There are an awful lot of bulk-load scenarios where people will gladly
take that trade, and are not very interested in halfway points either.
        regards, tom lane



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Simon Riggs
Date:
On 9 November 2012 16:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> On 9 November 2012 15:34, Kevin Grittner <kgrittn@mail.com> wrote:
>>> If we're not talking about making conflicts with other transactions
>>> behave just the same as an unqualified DELETE from a user
>>> perspective, I'm not sure what the goal is, exactly.
>
>> Reasonable question.
>
>> My goal is to allow COPY to load frozen tuples without causing MVCC violations.
>
> If that's the goal, I question why you're insisting on touching
> TRUNCATE's behavior.  We already have the principle that "TRUNCATE is
> like DELETE except not concurrent-safe".  Why not just invent a
> non-concurrent-safe option to COPY that loads prefrozen tuples into a
> new heap, and call it good?  There will be visibility oddness from that
> definition, sure, but AFAICS there will be visibility oddness from what
> you're talking about too.  You'll just have expended a very great deal
> of effort to make the weirdness a bit different.  Even if the TRUNCATE
> part of it were perfectly clean, the "load prefrozen tuples" part won't
> be --- so I'm not seeing the value of changing TRUNCATE.

This is wonderful thought and I wish I'd thought of it. My digression
via truncate now annoys me.

Yes, there are objections and I've read what Robert has said. An
explicit new option is perfectly entitled to introduce new behaviour
and won't cause a problem with existing applications. I personally
don't care about serializable stuff here, and nor do most others. They
just want a way to load new data quickly. It is important to me that
we do things in well principled ways, but the task at hand is data
loading not perfect visibility guarantees.

So what we're talking about here is a new mode for COPY, that when
requested will pre-freeze tuples when loading into a newly
created/truncated table. If the table isn't newly created/truncated
then we'll just ignore it and continue. I see no need to throw an
error, since that will just cause annoying usability issues.

COPY FREEZE here we come, with extensive docs to explain the trade-off
the user is accepting.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Robert Haas
Date:
On Fri, Nov 9, 2012 at 3:31 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> So what we're talking about here is a new mode for COPY, that when
> requested will pre-freeze tuples when loading into a newly
> created/truncated table. If the table isn't newly created/truncated
> then we'll just ignore it and continue. I see no need to throw an
> error, since that will just cause annoying usability issues.

Actually, why not just have it work always?  If people want to load
frozen tuples into a table that's not newly created/truncated, why not
let them?  Sure, there could be MVCC violations, but as long as the
behavior is opt-in, who cares?  I think it'd be useful to a lot of
people.

If we want to reduce (not eliminate) the potential MVCC issues, which
I think would be a good idea, we could take AccessExclusiveLock on the
table when COPY (FREEZE) is used.  Someone using an old snapshot but
accessing the table for the first time after AEL is released could
still see MVCC anomalies, but at least it would rule out things
changing in mid-query, which is the case that I think would be most
problematic.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Nov 9, 2012 at 3:31 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> So what we're talking about here is a new mode for COPY, that when
>> requested will pre-freeze tuples when loading into a newly
>> created/truncated table. If the table isn't newly created/truncated
>> then we'll just ignore it and continue. I see no need to throw an
>> error, since that will just cause annoying usability issues.

> Actually, why not just have it work always?  If people want to load
> frozen tuples into a table that's not newly created/truncated, why not
> let them?  Sure, there could be MVCC violations, but as long as the
> behavior is opt-in, who cares?  I think it'd be useful to a lot of
> people.

I thought about that too, but there's a big problem.  It wouldn't be
just MVCC that would be broken, but transactional integrity: if the
COPY fails partway through, the already-loaded rows still look valid.
The new-file requirement provides a way to roll them back.

I'm willing to have an option that compromises MVCC semantics
transiently, but giving up transactional integrity seems a bit much.
        regards, tom lane



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Robert Haas
Date:
On Mon, Nov 12, 2012 at 11:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Nov 9, 2012 at 3:31 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> So what we're talking about here is a new mode for COPY, that when
>>> requested will pre-freeze tuples when loading into a newly
>>> created/truncated table. If the table isn't newly created/truncated
>>> then we'll just ignore it and continue. I see no need to throw an
>>> error, since that will just cause annoying usability issues.
>
>> Actually, why not just have it work always?  If people want to load
>> frozen tuples into a table that's not newly created/truncated, why not
>> let them?  Sure, there could be MVCC violations, but as long as the
>> behavior is opt-in, who cares?  I think it'd be useful to a lot of
>> people.
>
> I thought about that too, but there's a big problem.  It wouldn't be
> just MVCC that would be broken, but transactional integrity: if the
> COPY fails partway through, the already-loaded rows still look valid.
> The new-file requirement provides a way to roll them back.
>
> I'm willing to have an option that compromises MVCC semantics
> transiently, but giving up transactional integrity seems a bit much.

Hmm, good point.  There might be some way around that, but figuring it
out is probably material for a separate patch.

But I guess that raises the question - should COPY (FREEZE) silently
ignore the option for not-new relfilenodes, or should it error out?
Simon proposed the former, but I'm wondering if the latter would be
better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: TRUNCATE SERIALIZABLE and frozen COPY

From
Simon Riggs
Date:
On 12 November 2012 16:22, Robert Haas <robertmhaas@gmail.com> wrote:

> But I guess that raises the question - should COPY (FREEZE) silently
> ignore the option for not-new relfilenodes, or should it error out?
> Simon proposed the former, but I'm wondering if the latter would be
> better.

It's got some complex pre-conditions, so having scripts fail because
you mis-specified FREEZE would be annoying.

The option indicates "I accept the potential MVCC violation", not "it
will always freeze".

If there is a better name...

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services