Home > mailing lists

Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0 - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
Date	April 17, 2015 08:39:02
Msg-id	20150417083856.GC20014@alap3.anarazel.de Whole thread Raw
In response to	Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0 (Peter Geoghegan <pg@heroku.com>)
Responses	Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
List	pgsql-hackers

Tree view

On 2015-04-16 09:43:54 -0700, Peter Geoghegan wrote:
> On Thu, Apr 16, 2015 at 2:23 AM, Andres Freund <andres@anarazel.de> wrote:
> > I'm, completely independent of logical decoding, of the *VERY* strong
> > opinion that 'speculative insertions' should never be visible when
> > looking with normal snapshots. For one it allows to simplify
> > considerations around wraparound (which has proven to be a very good
> > idea, c.f. multixacts + vacuum causing data corruption years after it
> > was thought to be harmless). For another it allows to reclaim/redefine
> > the bit after a database restart/upgrade. Given how complex this is and
> > how scarce flags are that seems like a really good property.
> >
> > And avoiding those flags to be visible to the outside requires a WAL
> > record, otherwise it won't be correct on the standby.
> 
> I'm a bit distracted here, and not sure exactly what you mean. What's
> a normal snapshot?

Normal visibility semantics, i.e. SnapshotMVCC, not SnapshotDirty.

> Do you just mean that you think that speculative insertions should be
> explicitly affirmed by a second record (making it not a speculative
> tuple, but rather, a fully fledged tuple)? IOW, an MVCC snapshot has
> no chance of seeing a tuple until it was affirmed by a second in-place
> modification, regardless of tuple xmin xact commit status?

Yes. I think

a) HEAP_SPECULATIVE should never be visible outside in a
committed transaction. That allows us to redefine what exactly the bit
means and such after a simple restart. On IM Heiki said he wants to
replace this by a bit in the item pointer, but I don't think that
changes things substantially.

b) t_ctid should not contain a speculative token in committed
(i.e. visible to other sessions using mvcc semantics) tuple. Right now
(i.e. master) t_ctid will point to itself for non-updated tuples. I
don't think it's good to have something in there that's not an actual
ctid in there. The number of places that look into t_ctid for
in-progress insertions of other sessions is smaller than the ones that
look at tuples in general.

c) Cleaning the flag/ctid after a completed speculative insertion makes
it far less likely that we end up waiting on a other backend when we
wouldn't have to. If a session inserts a large number of tuples
speculatively (surely *not* a unlikely thing in the long run) it gets
rather likely that tokens are reused. Which means if another backend
touches these in-progress records it's quite possible that the currently
acquired token is the same as the one on a tuple that has actually
finished inserting.

Greetings,

Andres Freund

pgsql-hackers by date:

From: Andres Freund
Date: 17 April 2015, 08:31:05
Subject: Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

From: Andres Freund
Date: 17 April 2015, 08:55:12
Subject: Re: Replication identifiers, take 4

Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0 - Mailing list pgsql-hackers

Previous

Next