Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0 - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
Date
Msg-id CAM3SWZTtVLdWx5HwziE459y9mscjcQ7iZDKG8G9d4Q8awPNOrQ@mail.gmail.com
Whole thread Raw
In response to Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0  (Gavin Flower <GavinFlower@archidevsys.co.nz>)
List pgsql-hackers
On Thu, Mar 26, 2015 at 2:51 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Attached revision, V3.1, implements this slightly different way of
>> figuring out if an insertion is speculative, as discussed. We reuse
>> t_ctid for speculatively inserted tuples. That's where the counter
>> goes. I think that this is a significant improvement, since there is
>> no longer any need to touch the proc array for any reason, without
>> there being any significant disadvantage that I'm aware of. I also
>> fixed some bitrot, and a bug with index costing (the details aren't
>> terribly interesting - tuple width wasn't being calculated correctly).
>
> Cool. Quickly looking at the patch though - does it actually work as it is?

The test cases pass, including jjanes_upsert, and stress tests that
test for unprincipled deadlocks. But yes, I am entirely willing to
believe that something that was written in such haste could be broken.
My manual testing was pretty minimal.

Sorry for posting a shoddy patch, but I thought it was more important
to show you that this is perfectly workable ASAP.

> RelationPutHeapTuple will overwrite the ctid field when the tuple is put on
> the page, so I don't think the correct token will make it to disk as the
> patch stands.

Oops - You're right. I find it interesting that this didn't result in
breaking my test cases. I guess that not having proc array locking
might have made the difference with unprincipled deadlocks, which I
could not recreate (and row locking saves us from breaking UPSERT, I
think - although if so the token lock would still certainly be needed
for the IGNORE variant). It is interesting that this wasn't obviously
broken for UPSERT, though. I think it at least suggests that when
testing, we need to be more careful with taking a working UPSERT as a
proxy for a working ON CONFLICT IGNORE.

> Also, there are a few places where we currently check if
> t_ctid equals the tuple's location, and try to follow t_ctid if it doesn't.
> I think those need to be taught that t_ctid can also be a token.

I did fix at least some of those. I thought that the choke point for
doing that was fairly small, entirely confined to one or two routines
with heapam.c. But it would surely be better to follow your suggestion
of using an invalid/magic tuple offset value to be sure that it cannot
possibly occur elsewhere. And I'm still using that infomask2 bit,
which is probably not really necessary. So that needs to change too.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Jim Nasby
Date:
Subject: Re: Remove fsync ON/OFF as a visible option?
Next
From: Jim Nasby
Date:
Subject: Re: trying to study how sorting works