Re: Parameterized-path cost comparisons need some work - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Parameterized-path cost comparisons need some work
Date
Msg-id CA+TgmoY05or-GS3rKPrehVCin7VS6+yYRR0KyQLV_ZfTXK7-4A@mail.gmail.com
In response to Re: Parameterized-path cost comparisons need some work  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Parameterized-path cost comparisons need some work
List pgsql-hackers
On Wed, Feb 29, 2012 at 6:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Well, my "evidence" is that a parameterized path should pretty much
>> always include a parameterized path somewhere in there - otherwise,
>> what is parameterization doing for us?
>
> Well, yes, we know that much.

I didn't write what I meant to write there.  I meant to say: a
parameterized path is presumably going to contain a parameterized
*index scan* somewhere within.  So somewhere we're going to have
something of the form

-> Index Scan blah on blah
      Index Cond: someattr = $1

And if that path weren't parameterized, we'd have to read the whole
relation, either with a full index scan, or a sequential scan.  Or, I
mean, maybe there's a filter condition, so that no path needs to
retrieve the *whole* relation, but even there the index cond is on top
of that, and it's probably doing something, though I suppose you're
right that there might be cases where it doesn't.

>> And that's going to reduce the
>> row count.  I may be missing something, but I'm confused as to why
>> this isn't nearly tautological.
>
> We don't know that --- I will agree it's likely, but that doesn't make
> it so certain that we can assume it without checking.  A join condition
> won't necessarily eliminate any rows.
>
> (... thinks about that for a while ...)  One thing we could possibly do
> is have indxpath.c arbitrarily reject parameterizations that don't
> produce a smaller estimated number of rows than an unparameterized scan.
> Admittedly, this still doesn't *prove* the assumption for join
> relations, but maybe it brings the odds to where it's okay for add_path
> to make such an assumption.

That seems to make sense.

> (... thinks some more ...)  No, that doesn't get us there, because that
> doesn't establish that a more-parameterized path produces fewer rows
> than some path that requires less parameterization, yet not none at
> all.  You really want add_path carrying out those comparisons.  In your
> previous example, it's entirely possible that path D is dominated by B
> or C because of poor choices of join quals.

I'm not following this part.  Can you explain further?  It seems to me
at any rate that we could get pretty far if we could just separate
parameterized paths and unparameterized paths into separate buckets.
Even if we have to do some extra work when comparing parameterized
paths *to each other*, we'd gain a fair amount by avoiding comparing
any of them with the unparameterized paths.  Or at least, I hope so.
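
To make the bucket idea concrete, here is a minimal sketch in Python. It
is not PostgreSQL code: the names Path, PathList, dominates and
required_outer are hypothetical, and the dominance test is an assumed
simplification that ignores startup cost and sort order. It shows only
that a new path is compared against paths in its own bucket, so
parameterized paths never compete with unparameterized ones.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Path:
    """Hypothetical stand-in for a planner path (not the real struct)."""
    name: str
    total_cost: float
    rows: float
    # Outer relations this path depends on; empty means unparameterized.
    required_outer: frozenset = frozenset()


def dominates(a: Path, b: Path) -> bool:
    """Assumed rule: a beats b if it needs no extra outer rels,
    costs no more, and returns no more rows."""
    return (a.required_outer <= b.required_outer
            and a.total_cost <= b.total_cost
            and a.rows <= b.rows)


@dataclass
class PathList:
    unparameterized: list = field(default_factory=list)
    parameterized: list = field(default_factory=list)

    def add_path(self, new: Path) -> bool:
        """Add new to its bucket unless an existing path there dominates it."""
        bucket = (self.parameterized if new.required_outer
                  else self.unparameterized)
        if any(dominates(old, new) for old in bucket):
            return False  # rejected without looking at the other bucket
        # Drop any existing paths in the same bucket that new dominates.
        bucket[:] = [old for old in bucket if not dominates(new, old)]
        bucket.append(new)
        return True
```

With this layout the extra row-count comparisons are paid only inside
the parameterized bucket; a sequential scan and an index scan on
someattr = $1 are simply never compared.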

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
