Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}
Date
Msg-id CAM3SWZQdv7GDLwPRv7=rE-gG1QjLOOL3vCmAriCBcTYk8GwqKw@mail.gmail.com
Whole thread Raw
In response to Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Dec 22, 2014 at 1:24 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> I feel that it needs to be possible to specify the constraint unambiguously
> in all cases. These are very rare use cases, but we should have an escape
> hatch for the rare cases that need it.
>
>
> What would it take to also support partial indexes?

Aside from considerations about how to pick them without using their
name, partial unique indexes aren't special at all. My earlier concern
was that we'd need to account for before row insert triggers that
change values out from under us. But maybe that concern was overblown,
come to think of it. I am already borrowing a little bit of the raw
parser's logic for CREATE INDEX statements for unique index inference
(during parse analysis) -- we're matching the cataloged index
definition attributes/expressions, so this makes a lot of sense. Maybe
I had the wrong idea about partial indexes earlier, which was that we
must use the values in the tuple proposed for insertion to check that
a partial index was a suitable arbiter of whether or not the UPDATE
path should be taken in respect of any given tuple. I should just go
further with borrowing things from CREATE INDEX, and give the user an
optional way of specifying a WHERE clause that is also matched in a
similar way to the expressions themselves. Did the partial unique
index your UPSERT implied not cover the ultimate tuple inserted after
before row insert triggers fired? That's on you as a user...you'll
always get an insert, since there won't be a would-be duplicate
violation to make there be an update.

I actually care about partial unique indexes a lot. They're a very
useful feature. Back when I was an application developer, I frequently
used "is_active" boolean columns to represent "logical app-level
deletion", where actually deleting the tuple was not possible (e.g.
because it may still be referenced in historic records), while not
wanting to have it be subject to uniqueness checks as a logically
deleted/!is_active tuple.

This measure to support partial indexes, plus the additional leeway
around non-default opclass unique indexes that I can add (that they
need only match the "equals" operator of the default opclass to be
accepted) brings us 99.9% of the way. That only leaves:

* An inability to specifying some subset of unique indexes or
exclusion constraints for the IGNORE variant (the UPDATE variant is
irrelevant).

* An inability to specifying a IGNORE arbitrating *exclusion
constraint* as the sole arbiter of whether or not the IGNORE path
should be taken. (exclusion constraints are not usable for the UPDATE
variant, so that's irrelevant again).

Did I forget something?

The use cases around these limitations are very rare, and only apply
to the IGNORE variant which seems much less interesting. I'm quite
comfortable dealing with them in a later release of PostgreSQL, to cut
scope (or avoid adding scope) for 9.5. Do you think that's okay? How
often will the IGNORE variant be used when everything shouldn't be
IGNOREd anyway? Although, to be totally fair, I should probably also
include:

* non-default B-tree opclasses cannot be specified as arbiters of the
alternative path (for both IGNORE and UPDATE variants) iff their
"equals" operator happens to not be the "equals" operator of the
default opclass (which is theoretical, and likely non-existent as a
use case).

If you're dead set on having an escape hatch, maybe we should just get
over it and add a way of specifying a unique index by name. As I said,
these under-served use cases are either exceedingly rare or entirely
theoretical.
-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Jim Nasby
Date:
Subject: Re: Proposal "VACUUM SCHEMA"
Next
From: Stephen Frost
Date:
Subject: Re: Role Attribute Bitmask Catalog Representation