Thread: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

[PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Aleksander Alekseev

Date:

25 January 2023, 15:45:12

Hi hackers,

Currently we allow self-conflicting inserts for ON CONFLICT DO NOTHING:

```
CREATE TABLE t (a INT UNIQUE, b INT);
INSERT INTO t VALUES (1,1), (1,2) ON CONFLICT DO NOTHING;
-- succeeds, inserting the first row and ignoring the second
```
... but not for ON CONFLICT .. DO UPDATE:

```
INSERT INTO t VALUES (1,1), (1,2) ON CONFLICT (a) DO UPDATE SET b = 0;
ERROR:  ON CONFLICT DO UPDATE command cannot affect row a second time
HINT: Ensure that no rows proposed for insertion within the same
command have duplicate constrained values.
```

Tom pointed out in 2016 that this is actually a bug [1] and I agree.

The proposed patch fixes this.

[1]: https://www.postgresql.org/message-id/22438.1477265185%40sss.pgh.pa.us

-- 
Best regards,
Aleksander Alekseev

Attachment

v1-0001-Make-ON-CONFLICT-DO-NOTHING-and-ON-CONFLICT-DO-UP.patch

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Andres Freund

Date:

25 January 2023, 18:34:47

Hi,

On 2023-01-25 18:45:12 +0300, Aleksander Alekseev wrote:
> Currently we allow self-conflicting inserts for ON CONFLICT DO NOTHING:
> 
> ```
> CREATE TABLE t (a INT UNIQUE, b INT);
> INSERT INTO t VALUES (1,1), (1,2) ON CONFLICT DO NOTHING;
> -- succeeds, inserting the first row and ignoring the second
> ```
> ... but not for ON CONFLICT .. DO UPDATE:
> 
> ```
> INSERT INTO t VALUES (1,1), (1,2) ON CONFLICT (a) DO UPDATE SET b = 0;
> ERROR:  ON CONFLICT DO UPDATE command cannot affect row a second time
> HINT: Ensure that no rows proposed for insertion within the same
> command have duplicate constrained values.
> ```
> 
> Tom pointed out in 2016 that this is actually a bug [1] and I agree.

I don't think I agree with this being a bug.

We can't sensible implement updating a row twice within a statement - hence
erroring out for ON CONFLICT DO UPDATE affecting a row twice. But what's the
justification for erroring out in the DO NOTHING case? ISTM that it's useful
to be able to handle such duplicates, and I don't immediately see what
semantic confusion or implementation difficulty we avoid by erroring out.

It seems somewhat likely that a behavioural change will cause trouble for some
of the uses of DO NOTHING out there.

Greetings,

Andres Freund

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Aleksander Alekseev

Date:

25 January 2023, 19:00:50

Hi Andres,

> I don't think I agree with this being a bug.

Perhaps that's not a bug especially considering the fact that the
documentation describes this behavior, but in any case the fact that:

```
INSERT INTO t VALUES (1,1) ON CONFLICT (a) DO UPDATE SET b = 0;
INSERT INTO t VALUES (1,2) ON CONFLICT (a) DO UPDATE SET b = 0;
```

and:

```
INSERT INTO t VALUES (1,1), (1,2) ON CONFLICT (a) DO NOTHING;
``

.. both work, and:

```
INSERT INTO t VALUES (1,1), (1,2) ON CONFLICT (a) DO UPDATE SET b = 0;
```

... doesn't is rather confusing. There is no reason why the latest
query shouldn't work except for a slight complication of the code.
Which seems to be a reasonable tradeoff, for me at least.

> But what's the justification for erroring out in the DO NOTHING case?
>
> [...]
>
> It seems somewhat likely that a behavioural change will cause trouble for some
> of the uses of DO NOTHING out there.

Just to make sure we are on the same page. The patch doesn't break the
current DO NOTHING behavior but rather makes DO UPDATE work the same
way DO NOTHING does.

-- 
Best regards,
Aleksander Alekseev

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Peter Geoghegan

Date:

25 January 2023, 19:08:04

On Wed, Jan 25, 2023 at 11:01 AM Aleksander Alekseev
<aleksander@timescale.com> wrote:
> Just to make sure we are on the same page. The patch doesn't break the
> current DO NOTHING behavior but rather makes DO UPDATE work the same
> way DO NOTHING does.

It also makes DO UPDATE not work the same way as either UPDATE itself
(which will silently skip a second or subsequent update of the same
row by the same UPDATE statement in RC mode), or MERGE (which has
similar cardinality violations).

DO NOTHING doesn't lock any conflicting row, and so won't have to
dirty pages that have matching rows. It was always understood to be
more susceptible to certain issues (when in READ COMMITTED mode) as a
result. There are some halfway reasonable arguments against this sort
of behavior, but I believe that we made the right trade-off.

-- 
Peter Geoghegan

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Aleksander Alekseev

Date:

25 January 2023, 19:27:38

Hi Peter,

> It also makes DO UPDATE not work the same way as either UPDATE itself
> (which will silently skip a second or subsequent update of the same
> row by the same UPDATE statement in RC mode), or MERGE (which has
> similar cardinality violations).

That's true. On the flip side, UPDATE and MERGE are different
statements and arguably shouldn't behave the same way INSERT does.

To clarify, I'm merely proposing the change and playing the role of
Devil's advocate here. I'm not arguing that the patch should be
necessarily accepted. In the end of the day it's up to the community
to decide. Personally I think it would make the users a bit happier.

The actual reason why I made the patch is that a colleague of mine,
Sven Klemm, encountered this limitation recently and was puzzled by it
and so was I. The only workaround the user currently has is to execute
several INSERTs one by one which is expensive when you have a lot of
INSERTs.

-- 
Best regards,
Aleksander Alekseev

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Andres Freund

Date:

25 January 2023, 22:34:32

Hi,

On 2023-01-25 22:00:50 +0300, Aleksander Alekseev wrote:
> Perhaps that's not a bug especially considering the fact that the
> documentation describes this behavior, but in any case the fact that:
> 
> ```
> INSERT INTO t VALUES (1,1) ON CONFLICT (a) DO UPDATE SET b = 0;
> INSERT INTO t VALUES (1,2) ON CONFLICT (a) DO UPDATE SET b = 0;
> ```
> 
> and:
> 
> ```
> INSERT INTO t VALUES (1,1), (1,2) ON CONFLICT (a) DO NOTHING;
> ``
> 
> .. both work, and:
> 
> ```
> INSERT INTO t VALUES (1,1), (1,2) ON CONFLICT (a) DO UPDATE SET b = 0;
> ```
> 
> ... doesn't is rather confusing. There is no reason why the latest
> query shouldn't work except for a slight complication of the code.
> Which seems to be a reasonable tradeoff, for me at least.

I don't agree that this is just about a "slight complication" of the code. I
think semantically the proposed new behaviour is pretty bogus.

It *certainly* can't be right to just continue with the update in heap_update,
as you've done. You'd have to skip the update, not execute it. What am I
missing here?

I think this'd completely break triggers, for example, because they won't be
able to get the prior row version, since it won't actually be a row ever
visible (due to cmin=cmax).

I suspect it might break unique constraints as well, because we'd end up with
an invisible row in part of the ctid chain.

> > But what's the justification for erroring out in the DO NOTHING case?
> >
> > [...]
> >
> > It seems somewhat likely that a behavioural change will cause trouble for some
> > of the uses of DO NOTHING out there.
> 
> Just to make sure we are on the same page. The patch doesn't break the
> current DO NOTHING behavior but rather makes DO UPDATE work the same
> way DO NOTHING does.

I see that now - I somehow thought you were recommending to error out in both
cases, rather than the other way round.

Greetings,

Andres Freund

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Aleksander Alekseev

Date:

26 January 2023, 10:07:08

Hi Andres,

> It *certainly* can't be right to just continue with the update in heap_update,

I see no reason why. What makes this case so different from updating a
tuple created by the previous command?

> as you've done. You'd have to skip the update, not execute it. What am I
> missing here?

Simply skipping updates in a statement that literally says DO UPDATE
doesn't seem to be the behavior a user would expect.

> I think this'd completely break triggers, for example, because they won't be
> able to get the prior row version, since it won't actually be a row ever
> visible (due to cmin=cmax).
>
> I suspect it might break unique constraints as well, because we'd end up with
> an invisible row in part of the ctid chain.

That's a reasonable concern, however I was unable to break unique
constraints or triggers so far:

```
CREATE TABLE t (a INT UNIQUE, b INT);

CREATE OR REPLACE FUNCTION t_insert() RETURNS TRIGGER AS $$
BEGIN
  RAISE NOTICE 't_insert triggered: new = %, old = %', NEW, OLD;
  RETURN NULL;
END
$$ LANGUAGE 'plpgsql';

CREATE OR REPLACE FUNCTION t_update() RETURNS TRIGGER AS $$
BEGIN
  RAISE NOTICE 't_update triggered: new = %, old = %', NEW, OLD;
  RETURN NULL;
END
$$ LANGUAGE 'plpgsql';

CREATE TRIGGER t_insert_trigger
AFTER INSERT ON t
FOR EACH ROW EXECUTE PROCEDURE t_insert();

CREATE TRIGGER t_insert_update
AFTER UPDATE ON t
FOR EACH ROW EXECUTE PROCEDURE t_update();

INSERT INTO t VALUES (1,1), (1,2) ON CONFLICT (a) DO UPDATE SET b = 0;

NOTICE:  t_insert triggered: new = (1,1), old = <NULL>
NOTICE:  t_update triggered: new = (1,0), old = (1,1)

INSERT INTO t VALUES (2,1), (2,2), (3,1) ON CONFLICT (a) DO UPDATE SET b = 0;

NOTICE:  t_insert triggered: new = (2,1), old = <NULL>
NOTICE:  t_update triggered: new = (2,0), old = (2,1)
NOTICE:  t_insert triggered: new = (3,1), old = <NULL>

=# SELECT * FROM t;
 a | b
---+---
 1 | 0
 2 | 0
 3 | 1
```

PFA patch v2 that also includes the test shown above.

Are there any other scenarios we should check?

-- 
Best regards,
Aleksander Alekseev

Attachment

v2-0001-Make-ON-CONFLICT-DO-NOTHING-and-ON-CONFLICT-DO-UP.patch

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Andres Freund

Date:

07 February 2023, 18:27:25

Hi,

On 2023-01-26 13:07:08 +0300, Aleksander Alekseev wrote:
> > It *certainly* can't be right to just continue with the update in heap_update,
> 
> I see no reason why. What makes this case so different from updating a
> tuple created by the previous command?

To me it's a pretty fundamental violation of how heap visibility works. I'm
quite sure that there will be problems, but I don't feel like investing the
time to find a reproducer for something that I'm ready to reject on principle.

> > as you've done. You'd have to skip the update, not execute it. What am I
> > missing here?
> 
> Simply skipping updates in a statement that literally says DO UPDATE
> doesn't seem to be the behavior a user would expect.

Given that we skip the update in "UPDATE", your argument doesn't hold much
water.

> > I think this'd completely break triggers, for example, because they won't be
> > able to get the prior row version, since it won't actually be a row ever
> > visible (due to cmin=cmax).
> >
> > I suspect it might break unique constraints as well, because we'd end up with
> > an invisible row in part of the ctid chain.
> 
> That's a reasonable concern, however I was unable to break unique
> constraints or triggers so far:

I think you'd have to do a careful analysis of a lot of code for that to hold
any water.

I continue to think that we should just reject this behavioural change.

Greetings,

Andres Freund

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Aleksander Alekseev

Date:

08 February 2023, 13:08:39

Hi,

> To me it's a pretty fundamental violation of how heap visibility works.

I don't think this has much to do with heap visibility. It's true that
generally a command doesn't see its own tuples. This is done in order
to avoid the Halloween problem which however can't happen in this
particular case.

Other than that the heap doesn't care about the visibility, it merely
stores the tuples. The visibility is determined by xmin/xmax, the
isolation level, etc.

It's true that the patch changes visibility rules in one very
particular edge case. This alone is arguably not a good enough reason
to reject a patch.

> Given that we skip the update in "UPDATE", your argument doesn't hold much
> water.

Peter made this argument above too and I will give the same anwer.
There is no reason why two completely different SQL statements should
behave the same.

> > That's a reasonable concern, however I was unable to break unique
> > constraints or triggers so far:
>
> I think you'd have to do a careful analysis of a lot of code for that to hold
> any water.

Alternatively we could work smarter, not harder, and let the hardware
find the bugs for us. Writing tests is much simpler and bullet-proof
than analyzing the code.

Again, to clarify, I'm merely playing the role of Devil's advocate
here. I'm not saying that the patch should necessarily be accepted,
nor am I 100% sure that it has any undiscovered bugs. However the
arguments against received so far don't strike me personally as being
particularly convincing.

As an example, one could argue that there are applications that
*expect* to get an ERROR in the case of self-conflicting inserts. And
by changing this behavior we will break these applications. If the
majority believes that we seriously care about this it would be a good
enough reason to withdraw the patch.

-- 
Best regards,
Aleksander Alekseev

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Andres Freund

Date:

08 February 2023, 15:49:50

Hi,

On 2023-02-08 16:08:39 +0300, Aleksander Alekseev wrote:
> > To me it's a pretty fundamental violation of how heap visibility works.
> 
> I don't think this has much to do with heap visibility. It's true that
> generally a command doesn't see its own tuples. This is done in order
> to avoid the Halloween problem which however can't happen in this
> particular case.
> 
> Other than that the heap doesn't care about the visibility, it merely
> stores the tuples. The visibility is determined by xmin/xmax, the
> isolation level, etc.

Yes, and the fact is that cmin == cmax is something that we don't normally
produce, yet you emit it now, without, as far as I can tell it, a convincing
reason.

> > > That's a reasonable concern, however I was unable to break unique
> > > constraints or triggers so far:
> >
> > I think you'd have to do a careful analysis of a lot of code for that to hold
> > any water.
> 
> Alternatively we could work smarter, not harder, and let the hardware
> find the bugs for us. Writing tests is much simpler and bullet-proof
> than analyzing the code.

That's a spectactularly wrong argument in almost all cases. Unless you have a
way to get to full branch coverage or use a model checker that basically does
the same, testing isn't going to give you a whole lot of confidence that you
haven't introduced bugs. This is particularly true for something like heapam,
where a lot of the tricky behaviour requires complicated interactions between
multiple connections.

> Again, to clarify, I'm merely playing the role of Devil's advocate
> here. I'm not saying that the patch should necessarily be accepted,
> nor am I 100% sure that it has any undiscovered bugs. However the
> arguments against received so far don't strike me personally as being
> particularly convincing.

I've said my piece, as-is I vote to reject the patch.

Greetings,

Andres Freund

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Peter Geoghegan

Date:

08 February 2023, 17:34:59

On Wed, Feb 8, 2023 at 5:08 AM Aleksander Alekseev
<aleksander@timescale.com> wrote:
> > To me it's a pretty fundamental violation of how heap visibility works.
>
> I don't think this has much to do with heap visibility. It's true that
> generally a command doesn't see its own tuples. This is done in order
> to avoid the Halloween problem which however can't happen in this
> particular case.
>
> Other than that the heap doesn't care about the visibility, it merely
> stores the tuples. The visibility is determined by xmin/xmax, the
> isolation level, etc.

I think that in a green field situation we would probably make READ
COMMITTED updates throw cardinality violations in the same way as ON
CONFLICT DO UPDATE, while not changing anything about ON CONFLICT DO
NOTHING. We made a deliberate trade-off with the design of DO NOTHING,
which won't lock conflicting rows, and so won't dirty any heap pages
that it doesn't insert on to.

I don't buy your argument about DO UPDATE needing to be brought into
line with DO NOTHING. In any case I'm pretty sure that Tom's remarks
in 2016 about a behavioral inconsistencies (which you cited) actually
called for making DO NOTHING more like DO UPDATE -- not the other way
around.

To me it seems as if allowing the same command to update the same row
more than once is just not desirable in general. It doesn't seem
necessary to bring low level arguments about cmin/cmax into it, nor
does it seem necessary to talk about things like the Halloween
problem. To me the best argument is also the simplest: who would want
us to allow it, and for what purpose?

I suppose that we might theoretically prefer to throw a cardinality
violation for DO NOTHING, but I don't see a way to do that without
locking rows and dirtying heap pages. If somebody were to argue that
we should make DO NOTHING lock rows and throw similar errors now then
I'd also disagree with them, but to a much lesser degree. I don't
think that this patch is a good idea.

-- 
Peter Geoghegan

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Aleksander Alekseev

Date:

09 February 2023, 10:06:04

Hi,

> Yes, and the fact is that cmin == cmax is something that we don't normally
> produce

Not sure if this is particularly relevant to this discussion but I
can't resist noticing that the heap doesn't even store cmin and
cmax... There is only HeapTupleHeaderData.t_cid and flags. cmin/cmax
are merely smoke and mirrors we use to trick a user.

And yes, the patch doesn't seem to break much mirrors:

```
=# create table t (a int unique, b int);
=# insert into t values (1,1), (1,2) on conflict (a) do update set b = 0;
=# SELECT xmin, xmax, cmin, cmax, * FROM t;
 xmin | xmax | cmin | cmax | a | b
------+------+------+------+---+---
  731 |    0 |    0 |    0 | 1 | 0
=# begin;
=# insert into t values (2,1), (2,2), (3,1) on conflict (a) do update set b = 0;
=# SELECT xmin, xmax, cmin, cmax, * FROM t;
 xmin | xmax | cmin | cmax | a | b
------+------+------+------+---+---
  731 |    0 |    0 |    0 | 1 | 0
  732 |    0 |    0 |    0 | 2 | 0
  732 |    0 |    0 |    0 | 3 | 1

=# insert into t values (2,1), (2,2), (3,1) on conflict (a) do update set b = 0;
=# SELECT xmin, xmax, cmin, cmax, * FROM t;
 xmin | xmax | cmin | cmax | a | b
------+------+------+------+---+---
  731 |    0 |    0 |    0 | 1 | 0
  732 |  732 |    1 |    1 | 2 | 0
  732 |  732 |    1 |    1 | 3 | 0

=# commit;
=# SELECT xmin, xmax, cmin, cmax, * FROM t;
 xmin | xmax | cmin | cmax | a | b
------+------+------+------+---+---
  731 |    0 |    0 |    0 | 1 | 0
  732 |  732 |    1 |    1 | 2 | 0
  732 |  732 |    1 |    1 | 3 | 0
```

> That's a spectactularly wrong argument in almost all cases. Unless you have a
> way to get to full branch coverage or use a model checker that basically does
> the same, testing isn't going to give you a whole lot of confidence that you
> haven't introduced bugs.

But neither will reviewing a lot of code...

> I've said my piece, as-is I vote to reject the patch.

Fair enough. I'm merely saying that rejecting a patch because it
doesn't include a TLA+ model is something novel :)

> I don't buy your argument about DO UPDATE needing to be brought into
> line with DO NOTHING. In any case I'm pretty sure that Tom's remarks
> in 2016 about a behavioral inconsistencies (which you cited) actually
> called for making DO NOTHING more like DO UPDATE -- not the other way
> around.

Interesting. Yep, we could use a bit of input from Tom on this one.

This of course would break backward compatibility. But we can always
invent something like:

```
INSERT INTO ..
ON CONFLICT DO [NOTHING|UPDATE .. ]
[ALLOWING|FORBIDDING] SELF CONFLICTS;
```

... if we really want to.

> problem. To me the best argument is also the simplest: who would want
> us to allow it, and for what purpose?

Good question.

This arguably has little use for application developers. As an
application developer you typically know your unique constraints and
using this knowledge you can rewrite the query as needed and add any
other accompanying logic.

However, extension developers, as an example, often don't know the
underlying unique constraints (more specifically, it's difficult to
look for them and process them manually) and often have to process any
garbage the application developer passes to an extension.

This of course is applicable not only to extensions, but to any
middleware between the DBMS and the application.

-- 
Best regards,
Aleksander Alekseev

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Aleksander Alekseev

Date:

09 February 2023, 10:16:20

Hi,

```
=# commit;
=# SELECT xmin, xmax, cmin, cmax, * FROM t;
 xmin | xmax | cmin | cmax | a | b
------+------+------+------+---+---
  731 |    0 |    0 |    0 | 1 | 0
  732 |  732 |    1 |    1 | 2 | 0
  732 |  732 |    1 |    1 | 3 | 0
```

Oops, you got me :) This of course isn't right - the xmax transaction
is committed but we still see the data, etc.

If we really are going to work on this, this part is going to require more work.

-- 
Best regards,
Aleksander Alekseev

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Andres Freund

Date:

09 February 2023, 10:28:18

Hi,

On 2023-02-09 13:06:04 +0300, Aleksander Alekseev wrote:
> > Yes, and the fact is that cmin == cmax is something that we don't normally
> > produce
> 
> Not sure if this is particularly relevant to this discussion but I
> can't resist noticing that the heap doesn't even store cmin and
> cmax... There is only HeapTupleHeaderData.t_cid and flags. cmin/cmax
> are merely smoke and mirrors we use to trick a user.

No, they're not just that. Yes, cmin/cmax aren't both stored on-disk, but if
both are needed, they *are* stored in-memory. We can do that because it's only
ever needed from within a transaction.

> > That's a spectactularly wrong argument in almost all cases. Unless you have a
> > way to get to full branch coverage or use a model checker that basically does
> > the same, testing isn't going to give you a whole lot of confidence that you
> > haven't introduced bugs.
> 
> But neither will reviewing a lot of code...

And yet my review did figure out that your patch would have visibility
problems, which you did end up having, as you noticed yourself downthread :)

> > I've said my piece, as-is I vote to reject the patch.
> 
> Fair enough. I'm merely saying that rejecting a patch because it
> doesn't include a TLA+ model is something novel :)

I obviously am not suggesting that (although some things could probably
benefit). Just that not having an example showing something working, isn't
sufficient to consider something suspicious OK.

And changes affecting heapam.c visibility semantics need extremely careful
review, I have the battle scars to prove that to be true :P.

Greetings,

Andres Freund

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

Aleksander Alekseev

Date:

09 February 2023, 10:42:36

Hi,

> And yet my review did figure out that your patch would have visibility
> problems, which you did end up having, as you noticed yourself downthread :)

Yep, this particular implementation turned out to be buggy.

>> I don't buy your argument about DO UPDATE needing to be brought into
>> line with DO NOTHING. In any case I'm pretty sure that Tom's remarks
>> in 2016 about a behavioral inconsistencies (which you cited) actually
>> called for making DO NOTHING more like DO UPDATE -- not the other way
>> around.
>
> Interesting. Yep, we could use a bit of input from Tom on this one.
>
> This of course would break backward compatibility. But we can always
> invent something like:
>
> ```
> INSERT INTO ..
> ON CONFLICT DO [NOTHING|UPDATE .. ]
> [ALLOWING|FORBIDDING] SELF CONFLICTS;
> ```
>
> ... if we really want to.

I suggest we discuss if we even want to support something like this
before processing further and then think about a particular
implementation if necessary.

One thing that occured to me during the discussion is that we don't
necessarily have to physically write one tuple at a time to the heap.
Alternatively we could use information about the existing unique
constraints and write only the needed tuples.

> However, extension developers, as an example, often don't know the
> underlying unique constraints (more specifically, it's difficult to
> look for them and process them manually) and often have to process any
> garbage the application developer passes to an extension.
>
> This of course is applicable not only to extensions, but to any
> middleware between the DBMS and the application.

This however is arguably a niche use case. So maybe we don't want to
spend time on this.

-- 
Best regards,
Aleksander Alekseev

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

From

"Gregory Stark (as CFM)"

Date:

01 March 2023, 19:42:14

On Thu, 9 Feb 2023 at 05:43, Aleksander Alekseev
<aleksander@timescale.com> wrote:
>
> >> I don't buy your argument about DO UPDATE needing to be brought into
> >> line with DO NOTHING. In any case I'm pretty sure that Tom's remarks
> >> in 2016 about a behavioral inconsistencies (which you cited) actually
> >> called for making DO NOTHING more like DO UPDATE -- not the other way
> >> around.
> >
> > Interesting. Yep, we could use a bit of input from Tom on this one.

I realize there are still unanswered conceptual questions about this
patch but with two votes against it seems unlikely to make much more
progress unless you rethink what you're trying to accomplish and
package it in a way that doesn't step on these more controversial
issues.

I'm going to mark the patch Returned With Feedback. If Tom or someone
else disagrees with Peter and Andres or has some new insights about
how to make it more palatable then we can always revisit that.

> > This of course would break backward compatibility. But we can always
> > invent something like:
> >
> > ```
> > INSERT INTO ..
> > ON CONFLICT DO [NOTHING|UPDATE .. ]
> > [ALLOWING|FORBIDDING] SELF CONFLICTS;
> > ```
> >
> > ... if we really want to.

Something like that might be what I mean about new insights though I
suspect this is overly complex. It looks like having the ON CONFLICT
UPDATE happen before the row is already inserted might simplify things
conceptually but then it might make the implementation prohibitively
complex.

-- 
Gregory Stark
As Commitfest Manager