Re: slow IN() clause for many cases - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: slow IN() clause for many cases
Date
Msg-id 1133302026.2906.454.camel@localhost.localdomain
Whole thread Raw
In response to Re: slow IN() clause for many cases  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: slow IN() clause for many cases  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Mon, 2005-10-17 at 12:49 +0100, Simon Riggs wrote:
> On Fri, 2005-10-14 at 19:09 -0400, Tom Lane wrote:
> > I wrote:
> > > I'm thinking that IN should be
> > > converted to a ScalarArrayOpExpr, ie
> > 
> > >     x = ANY (ARRAY[val1,val2,val3,val4,...])
> > 
> > Actually, there is one little thing in the way of doing this: it'll
> > fail if any of the IN-list elements are NULL, because we have not got
> > support for arrays with null elements.  So we'd have to fix that first.
> 
> You'd also need to consider how this effects partial indexes and
> constraint exclusion. Not much of an issue, but an extra case to handle
> in the predicate proving code.
> 
> = = =
> 
> Just had a case where using an IN list was quicker than using a join
> because it allowed an index lookup to occur. There is also some clear
> mileage in transforming this type of query to a more plannable form:
> 
> select * from bigtable where word IN (
> select word from customer_word where customer = 6)
> 
> i.e. where the values for the IN clause are evaluated at run time,
> rather than at plan time.

Do you think we'll be able to generate a single ScalarArrayOpExpr from a
small subselect and pass it through as an indexable expression?

I'm guessing its not lost on you that this would give a Star join like
capability, when joining multiple dimension tables to a large Fact
table.

e.g.

Select * From Sales where month IN (
select month from time_dimension where FinYear = 2005 and Quarter = 3)
...

Having taught predtest.c about ScalarArrayOpExpr means that would allow
this to work with constraint exclusion.

So that solves the how-to-join-AND-partition problem I've been
struggling with: don't join, transform. Very cool.

Best Regards, Simon Riggs





pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Using multi-row technique with COPY
Next
From: Simon Riggs
Date:
Subject: Re: ice-broker scan thread