query_is_distinct_for does not take into account set returning functions - Mailing list pgsql-hackers

From David Rowley
Subject query_is_distinct_for does not take into account set returning functions
Date
Msg-id CAApHDvrfVkH0P3FAooGcckBy7feCJ9QFanKLkX7MWsBcxY2Vcg@mail.gmail.com
Responses Re: query_is_distinct_for does not take into account set returning functions  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
In http://www.postgresql.org/message-id/6351.1404663344@sss.pgh.pa.us Tom noted that create_unique_path did not check for set-returning functions.

Tom wrote:
> I notice that create_unique_path is not paying attention to the question
> of whether the subselect's tlist contains SRFs or volatile functions.
> It's possible that that's a pre-existing bug.

I looked at this a bit and I can confirm that it does not behave as it should. Take the following as an example:

create table x (id int primary key);
create table y (n int not null);

insert into x values(1);
insert into y values(1);

select * from x where (id,id) in(select n,generate_series(1,2) / 10 + 1 g from y);
 id
----
  1
(1 row)

select * from x where (id,id) in(select n,generate_series(1,2) / 10 + 1 g from y group by n);
 id
----
  1
  1
(2 rows)

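The 2nd query does GROUP BY n, so query_is_distinct_for returns true; the outer query therefore thinks it's OK to perform an INNER JOIN rather than a SEMI JOIN, which in this case produces an extra row. The set-returning function is expanded after the grouping, so the subquery's output is not actually distinct.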
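To make the duplicate easier to see, here is a hand-written sketch of the problem. It is not the planner's literal output: the explicit INNER JOIN is only my approximation of what the plan amounts to once the subquery is assumed to be distinct, and the alias s is made up for the illustration. The temp tables repeat the setup from the example above.

```sql
-- Same setup as the example above, as temp tables so the block stands alone.
create temp table x (id int primary key);
create temp table y (n int not null);
insert into x values (1);
insert into y values (1);

-- The subquery on its own: generate_series(1,2) yields 1 and 2, and integer
-- division by 10 plus 1 maps both to 1, so two identical rows (1, 1) come
-- back even though the query groups by n.
select n, generate_series(1,2) / 10 + 1 g from y group by n;

-- Approximation of the join performed when the subquery is wrongly treated
-- as distinct: each duplicate subquery row matches x.id = 1, giving 2 rows.
select x.id
from x
inner join (select n, generate_series(1,2) / 10 + 1 g from y group by n) s
        on x.id = s.n and x.id = s.g;
```

A semi join would stop at the first matching subquery row for each row of x, which is why the IN form should only ever return one row here.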

I think we should probably add the test for set-returning functions to query_is_distinct_for.

The attached patch fixes the problem.

Regards

David Rowley
