Re: Parameterized paths vs index clauses extracted from OR clauses - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Parameterized paths vs index clauses extracted from OR clauses
Msg-id CA+TgmoZOrNuRpAcu17jv-+bkvAb-EoS0T6xG8VkHadHWtaL=iw@mail.gmail.com
In response to Re: Parameterized paths vs index clauses extracted from OR clauses  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Parameterized paths vs index clauses extracted from OR clauses  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, Mar 5, 2013 at 3:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Well, the point is not so much about whether it's an improvement as that
> 9.2's current behavior is a regression from 9.1 and earlier.  People may
> not like changes in minor releases, but they don't like regressions
> either.

That's true, but I'm still worried that we're just moving the
unhappiness around from one group of people to another group of
people, and I don't have a lot of confidence about which group is
larger.

>>> A downside of this approach is that to preserve
>>> the same-number-of-rows assumption, we'd end up having to enforce the
>>> extracted clauses as filter clauses in parameterized paths, even if
>>> they'd not proved to be of any use as index quals.
>
>> I'm not sure I fully grasp why this is a downside.  Explain further?
>
> Because we'd be checking redundant clauses.  You'd get something like
>
>         Nested Loop
>                 Filter: (foo OR (bar AND baz))
>
>                 ... some outer scan here ...
>
>                 Index Scan:
>                         Filter: (foo OR bar)
>
> If "foo OR bar" is useful as an indexqual condition in the inner scan,
> that's one thing.  But if it isn't, the cycles expended to check it in
> the inner scan are possibly wasted, because we'll still have to check
> the full original OR clause later.  It's possible that the filter
> condition removes enough rows from the inner scan's result to justify
> the redundant checks, but it's at least as possible that it doesn't.

Yeah, that's pretty unappealing.  It probably doesn't matter much if
foo is just a column reference, but what if it's an expensive
function?  For that matter, what if it's a volatile function that we
can't execute twice without changing the results?
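
To put a rough number on the redundant-check cost, here is a minimal Python sketch. It is not PostgreSQL code: run_join, the row layout, and the predicates foo, bar, and baz are all hypothetical stand-ins, and the inner scan is simplified to return every inner row on each rescan. It counts how often foo is evaluated when the extracted clause (foo OR bar) is enforced as an inner-scan filter and the full clause (foo OR (bar AND baz)) is still rechecked at the join.

```python
def run_join(outer_rows, inner_rows, use_extracted_filter):
    """Toy nested loop; returns (joined rows, number of foo() evaluations)."""
    counter = {"foo": 0}

    def foo(inner):
        # Stand-in for an expensive qual on the inner relation.
        counter["foo"] += 1
        return inner["x"] == 0

    def bar(inner):
        return inner["y"] == 0

    def baz(outer, inner):
        # Stand-in for the join clause referencing both relations.
        return outer["k"] == inner["k"]

    result = []
    for o in outer_rows:
        for i in inner_rows:
            # Extracted clause, enforced as a filter in the inner scan.
            if use_extracted_filter and not (foo(i) or bar(i)):
                continue
            # Full original OR clause, still checked at the join level.
            if foo(i) or (bar(i) and baz(o, i)):
                result.append((o["k"], i["id"]))
    return result, counter["foo"]


outer_rows = [{"k": k} for k in range(3)]
inner_rows = [{"id": n, "x": n % 2, "y": n % 3, "k": n % 3} for n in range(6)]

plain_rows, plain_calls = run_join(outer_rows, inner_rows, False)
filtered_rows, filtered_calls = run_join(outer_rows, inner_rows, True)
```

Both runs return the same rows, but in this toy data set the filtered variant evaluates foo once in the inner filter and again at the join for every surviving row, so filtered_calls comes out larger than plain_calls. How the real trade-off falls out would depend on how selective the extracted clause is and what foo costs.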

>> Since there's little point in using a parameterized path in the first
>> place unless it enables you to drastically reduce the number of rows
>> being processed, I would anticipate that maybe the consequences aren't
>> too bad, but I'm not sure.
>
> Yeah, we could hope that the inner scan is already producing few enough
> rows that it doesn't matter much.  But I think that we'd end up checking
> the added qual even in a non-parameterized scan; there's no mechanism
> for pushing quals into the general qual lists and then retracting them
> later.  (Hm, maybe what we need is a marker for "enforce this clause
> only if you feel like it"?)

Not sure I get the parenthesized bit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
