Thread: [PATCH] Deferrable unique constraints vs join removal -- bug?

[PATCH] Deferrable unique constraints vs join removal -- bug?

From
Marti Raudsepp
Date:
Hi list,

PostgreSQL 9.0 introduced the "join removal" feature for cases where
an left join can't return more than one output row. This feature
relies on checking whether a UNIQUE constraint exists on the joined
column in the referenced table.

However, there was another new feature in 9.0: deferrable unique
constraints. It seems that the join removal code currently doesn't
check whether the constraint is deferrable or not. Since deferrable
unique constraints may contain duplicate rows in the course of a
transaction, join removal changes the results returned from a query.

This probably doesn't affect many real-world applications, but it
seems wrong that a performance feature can affect results returned by
a query.

Test case:

create table uniq (i int unique deferrable initially deferred);
begin;
insert into uniq values(1),(1);
select count(*) from uniq a left join uniq b using (i);
 count
-------
     2

An inner join performs as expected:
marti=# select count(*) from uniq a inner join uniq b using (i);
 count
-------
     4

----
Attached is a patch with an attempt to fix it. I'm not at all sure
this is the right approach, but it seems the cheapest way to make sure
is walk through *all* rows in pg_constraint for the given table, since
there is no index on pg_constraint.conindid. I'm not sure whether the
cost of this check outweighs the usefulness of this patch.

Catalog changes are a no-no for backported patches, right?
Or should I just go ahead and create this index, and not worry about
backporting?

I'm also adding lots of includes to this file. Maybe
unique_index_is_consistent() should be moved to another file instead?

While passing by, I also added an unrelated check to
check_functional_grouping() for the validity of the constraint. This
isn't necessary for now (unique constraints are always valid), but
seems useful just in case this changes in the future.

Regards,
Marti

Attachment

Re: [PATCH] Deferrable unique constraints vs join removal -- bug?

From
Robert Haas
Date:
On Wed, Oct 19, 2011 at 7:35 AM, Marti Raudsepp <marti@juffo.org> wrote:
> This probably doesn't affect many real-world applications, but it
> seems wrong that a performance feature can affect results returned by
> a query.
>
> Test case:
>
> create table uniq (i int unique deferrable initially deferred);
> begin;
> insert into uniq values(1),(1);
> select count(*) from uniq a left join uniq b using (i);
>  count
> -------
>     2

Yuck.  Well, that's certainly a bug.  What's weird is that I thought
we had put logic into the join removal code to ignore deferrable
constraints.  Apparently not.  I think maybe what we should do is add
an "immediate" field to IndexOptInfo, mirroring the existing unique
flag, and have get_relation_info() populate it from indimmediate, and
then make relation_has_unique_index() disqualify any non-immediate
index.

has_unique_index() arguably needs a similar fix, although at present
that appears to be used for only statistic purposes, so maybe it's OK. A comment update might be a good idea, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [PATCH] Deferrable unique constraints vs join removal -- bug?

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Yuck.  Well, that's certainly a bug.  What's weird is that I thought
> we had put logic into the join removal code to ignore deferrable
> constraints.

Yeah, I thought we had too.
        regards, tom lane


Re: [PATCH] Deferrable unique constraints vs join removal -- bug?

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Yuck.  Well, that's certainly a bug.  What's weird is that I thought
> we had put logic into the join removal code to ignore deferrable
> constraints.  Apparently not.

I poked around a bit more and could not find any evidence that we'd
ever done that.  Ah well.

> I think maybe what we should do is add
> an "immediate" field to IndexOptInfo, mirroring the existing unique
> flag, and have get_relation_info() populate it from indimmediate, and
> then make relation_has_unique_index() disqualify any non-immediate
> index.

Yeah, this seems like the right fix.  I considered redefining the unique
flag to mean indisunique && indimmediate, but that's wrong because of:

> has_unique_index() arguably needs a similar fix, although at present
> that appears to be used for only statistic purposes, so maybe it's OK.

Yes, since this is meant for statistical purposes, I think it's
appropriate for it to disregard indimmediate.

>   A comment update might be a good idea, though.

Or we could add a parameter to have the caller indicate which behavior
is wanted.  But for now I think a comment is enough.
        regards, tom lane


Re: [PATCH] Deferrable unique constraints vs join removal -- bug?

From
Marti Raudsepp
Date:
On Sun, Oct 23, 2011 at 06:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I think maybe what we should do is add
>> an "immediate" field to IndexOptInfo, mirroring the existing unique
>> flag, and have get_relation_info() populate it from indimmediate, and
>> then make relation_has_unique_index() disqualify any non-immediate
>> index.
>
> Yeah, this seems like the right fix.

Oh, that sounds pretty obvious now that you mention it. :)

I will try to come up with a new patch in a few days (haven't had too
much time lately).

Regards,
Marti


Re: [PATCH] Deferrable unique constraints vs join removal -- bug?

From
Tom Lane
Date:
Marti Raudsepp <marti@juffo.org> writes:
> On Sun, Oct 23, 2011 at 06:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, this seems like the right fix.

> Oh, that sounds pretty obvious now that you mention it. :)

> I will try to come up with a new patch in a few days (haven't had too
> much time lately).

Oh, I did it already.
        regards, tom lane


Re: [PATCH] Deferrable unique constraints vs join removal -- bug?

From
Marti Raudsepp
Date:
On Sun, Oct 23, 2011 at 21:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I will try to come up with a new patch in a few days (haven't had too
>> much time lately).
>
> Oh, I did it already.

Cool.

I noticed now that you didn't add a regression test for this fix.
Perhaps you could reuse the test from my patch, which also tests for
plan invalidation.

Regards,
Marti