Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0 - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
Msg-id CAM3SWZRaJG+7jDhPJFhgfcsMWh1KquZZCGNa-D9eTBBVb4+Fng@mail.gmail.com
In response to Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
List pgsql-hackers
On Wed, Mar 18, 2015 at 2:41 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Here's what I had in mind: the inserter tags the tuple with the speculative
> insertion token, by storing the token in the t_ctid field. If the inserter
> needs to super-delete the tuple, it sets xmax like in a regular deletion,
> but also sets another flag to indicate that it was a super-deletion.

I was able to quickly hack up a prototype of this in my hotel room at
pgConf.US. It works fine at first blush, passing the jjanes_upsert
stress tests and my own regression tests without a problem. Obviously
it needs more testing and clean-up before posting, but I was pleased
with how easy this was.
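
A rough sketch of the tuple life cycle described above, assuming
simplified stand-in types. SketchTuple, sketch_speculative_insert and
sketch_super_delete are hypothetical names, not PostgreSQL's heap
structures and not the prototype's code. The token is kept in a plain
field here instead of t_ctid:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not PostgreSQL's real definitions. */
typedef uint32_t SketchXid;
#define SKETCH_INVALID_XID ((SketchXid) 0)

typedef struct SketchTuple
{
    SketchXid xmin;          /* inserting transaction */
    SketchXid xmax;          /* deleting transaction, if any */
    uint32_t  spec_token;    /* stands in for the token stored in t_ctid */
    bool      super_deleted; /* stands in for the extra super-deletion flag */
} SketchTuple;

/* The inserter tags the new tuple with its speculative insertion token. */
void
sketch_speculative_insert(SketchTuple *tup, SketchXid xid, uint32_t token)
{
    tup->xmin = xid;
    tup->xmax = SKETCH_INVALID_XID;
    tup->spec_token = token;
    tup->super_deleted = false;
}

/*
 * Super-deletion: set xmax as in a regular deletion, and additionally
 * set the flag that marks the deletion as a super-deletion.
 */
void
sketch_super_delete(SketchTuple *tup, SketchXid xid)
{
    tup->xmax = xid;
    tup->super_deleted = true;
}
```

Note that nothing clears spec_token on a completed insertion in this
sketch, which matches the point made further down.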

> When another backend inserts, and notices that it has a potential conflict
> with the first tuple, it tries to acquire a hw-lock on the token. In most
> cases, the inserter has long since completed the insertion, and the
> acquisition succeeds immediately, but you have to check because the token is
> not cleared on a completed insertion.

You don't even have to check/take a ShareLock on the token when the
other xact has committed or aborted. If it is there, then based on
that (and on the fact that it wasn't super-deleted) you already know
that the tuple is visible/committed or, in the event of
other-xact-abort, not visible/aborted. In other words, we continue to
only check for a speculative token when the inserting xact is in
flight - we just take the token from the heap now instead. Not much
needs to change, AFAICT.
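
To make the decision above concrete, here is a minimal sketch under
stated assumptions. The enums and sketch_check_conflict are
hypothetical names, and treating a super-deleted tuple as "no conflict"
regardless of xact state is my simplification for illustration:

```c
#include <stdbool.h>

/* Hypothetical status of the transaction that inserted the other tuple. */
typedef enum
{
    SKETCH_XACT_IN_PROGRESS,
    SKETCH_XACT_COMMITTED,
    SKETCH_XACT_ABORTED
} SketchXactStatus;

/* What the second inserter should do about the potential conflict. */
typedef enum
{
    SKETCH_WAIT_ON_TOKEN, /* read the token from the heap tuple and lock it */
    SKETCH_REAL_CONFLICT, /* tuple is committed and visible */
    SKETCH_NO_CONFLICT    /* tuple is aborted or was super-deleted */
} SketchAction;

SketchAction
sketch_check_conflict(SketchXactStatus inserter_status, bool super_deleted)
{
    /* Assumption: a super-deleted tuple never counts as a conflict. */
    if (super_deleted)
        return SKETCH_NO_CONFLICT;

    switch (inserter_status)
    {
        case SKETCH_XACT_IN_PROGRESS:
            /* Only here does the speculative token matter at all. */
            return SKETCH_WAIT_ON_TOKEN;
        case SKETCH_XACT_COMMITTED:
            /* No token check or ShareLock needed. */
            return SKETCH_REAL_CONFLICT;
        case SKETCH_XACT_ABORTED:
            return SKETCH_NO_CONFLICT;
    }
    return SKETCH_NO_CONFLICT; /* unreachable; keeps compilers quiet */
}
```

The only branch that touches the token is the in-progress one, which is
why a stale token left behind on a completed insertion is harmless in
this sketch.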

> Regarding the physical layout: We can use a magic OffsetNumber value above
> MaxOffsetNumber to indicate that the t_ctid field stores a token rather than
> a regular ctid value. And another magic t_ctid value to indicate that a
> tuple has been super-deleted. The token and the super-deletion flag are
> quite ephemeral, they are not needed after the inserting transaction has
> completed, so it's nice to not consume the valuable infomask bits for these
> things. Those states are conveniently not possible on an updated tuple, when
> we would need the t_ctid field for its current purpose.

Haven't done anything about this yet. I'm just using an infomask2 bit
for now. Although that was only because I forgot that you suggested
this before having a go at implementing this new t_ctid scheme!
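
For reference, the t_ctid encoding from the quoted text might look
roughly like the sketch below. This is not what the prototype does (it
uses an infomask2 bit). The struct, the function names and all three
constants are hypothetical; in particular the two magic values are
picked arbitrarily, and the 2048 limit assumes the default 8K block
size:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint16_t SketchOffsetNumber;

/* Hypothetical constants, chosen only for illustration. */
#define SKETCH_MAX_OFFSET       ((SketchOffsetNumber) 2048)   /* assumed limit */
#define SKETCH_SPEC_TOKEN_OFF   ((SketchOffsetNumber) 0xfffe) /* "holds a token" */
#define SKETCH_SUPER_DELETE_OFF ((SketchOffsetNumber) 0xfffd) /* "super-deleted" */

/* Simplified stand-in for a ctid: 32-bit block number plus offset. */
typedef struct SketchItemPointer
{
    uint16_t           blk_hi;
    uint16_t           blk_lo;
    SketchOffsetNumber posid;
} SketchItemPointer;

/* Store the token in the block-number bits, flagged by a magic offset. */
void
sketch_set_token(SketchItemPointer *ctid, uint32_t token)
{
    ctid->blk_hi = (uint16_t) (token >> 16);
    ctid->blk_lo = (uint16_t) (token & 0xffff);
    ctid->posid = SKETCH_SPEC_TOKEN_OFF;
}

bool
sketch_has_token(const SketchItemPointer *ctid)
{
    return ctid->posid == SKETCH_SPEC_TOKEN_OFF;
}

uint32_t
sketch_get_token(const SketchItemPointer *ctid)
{
    return ((uint32_t) ctid->blk_hi << 16) | ctid->blk_lo;
}

/* A second magic offset marks the tuple as super-deleted. */
void
sketch_mark_super_deleted(SketchItemPointer *ctid)
{
    ctid->posid = SKETCH_SUPER_DELETE_OFF;
}

bool
sketch_is_super_deleted(const SketchItemPointer *ctid)
{
    return ctid->posid == SKETCH_SUPER_DELETE_OFF;
}

/* Anything at or below the limit is an ordinary ctid. */
bool
sketch_is_regular_ctid(const SketchItemPointer *ctid)
{
    return ctid->posid <= SKETCH_MAX_OFFSET;
}
```

Because both magic offsets lie above the assumed maximum, an ordinary
ctid can never be mistaken for a token or a super-deletion marker.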

My next revision will have a more polished version of this scheme. I'm
not going to immediately act on Robert's feedback elsewhere (although
I'd like to), owing to time constraints - no reason to deny you the
opportunity to review the entirely unrelated low-level speculative
locking mechanism due to that.

-- 
Peter Geoghegan


