Re: huge disparities in =/IN/BETWEEN performance - Mailing list pgsql-sql

From Tom Lane
Subject Re: huge disparities in =/IN/BETWEEN performance
Date
Msg-id 5625.1171060719@sss.pgh.pa.us
Whole thread Raw
In response to Re: huge disparities in =/IN/BETWEEN performance  ("George Pavlov" <gpavlov@mynewplace.com>)
List pgsql-sql
"George Pavlov" <gpavlov@mynewplace.com> writes:
> I somehow wish I could tell the optimizer to
> first figure out which stuff_ids are related to the user_id that is
> being asked for and then look ONLY those up in the stuff_events table
> using the index on stuff_id. 

This is not really an optimizer problem, or at least not just an
optimizer problem.  The type of plan I think you are wishing for is what
the source code calls a "nestloop with inner index scan", and that
terminology should tip you off that it's only considered when the inner
relation is just a simple indexscannable table.  GROUP BY subqueries
need not apply :-(.

I've been speculating recently about how this situation might be
improved, but I fear it will require nontrivial executor changes along
with planner changes.  The executor's present mechanism for passing
variable values from the outer plan to the inner is a hack that only
really works for indexscans.  I got it to work for inheritance cases
too, recently, but that's about as far as it can be pushed.  I think
it might be possible to get rid of it and use the more-recently-invented
subplan parameter mechanism, but I haven't worked out the details.
(And I know that the Greenplum crowd would like to get rid of subplan
parameters, so I'm not sure this idea will go over well anyway.)  The
planner changes needed will be pretty wide-ranging too, likely.

This might happen for 8.4 but I wouldn't promise it for 8.3.
        regards, tom lane


pgsql-sql by date:

Previous
From: "George Pavlov"
Date:
Subject: Re: huge disparities in =/IN/BETWEEN performance
Next
From: "Franz Stuetzle"
Date:
Subject: unsubscribe