Re: [HACKERS] having and union in v7beta - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] having and union in v7beta
Date
Msg-id 514.951853793@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] having and union in v7beta  (Jose Soares <jose@sferacarta.com>)
List pgsql-hackers
Jose Soares <jose@sferacarta.com> writes:
> I tried the following query :

> select * from comuni where nome in (
>      select nome from comuni group by nome having 1 < count(nome)
>      );

> on the above table populated with 8342 rows, PostgreSQL begins searching
> and I wait for hours without any result.

I'd expect that to be pretty slow, since it's going to execute the inner
select for every tuple examined by the outer select.  Shouldn't be any
worse than 6.5 though.  IN (sub-SELECT) has always been slow.

The real solution is to figure out how to do this kind of thing via
joins, but that will have to wait for the fabled querytree redesign.

I have been toying with the notion of sticking a MATERIALIZE node
into the plan tree when we have an IN sub-select and the sub-plan is
complicated, but has no parameters passed from the upper plan.
(Not sure yet how complicated is complicated enough, but a plan that
requires sorting or indexscanning should qualify.)  The MATERIALIZE
node would run the sub-plan just once and stash the output tuples in
a temp table; then we'd only need a simple scan of the temp table for
each outer tuple.  I think that would improve the speed of IN
sub-SELECTs by a useful amount in many cases, and it'd be a lot easier
than the "real" solution.

However, I'm not sure it's a good idea to do this when we've already
started beta.  Should I try it, or leave it alone until 7.1?  By 7.1
it might be moot...
        regards, tom lane


pgsql-hackers by date:

Previous
From: Vince Vielhaber
Date:
Subject: Re: [HACKERS] Re: NOT {NULL|DEFERRABLE} (was: bug in 7.0)
Next
From: The Hermit Hacker
Date:
Subject: Re: [HACKERS] Re: NOT {NULL|DEFERRABLE} (was: bug in 7.0)