Re: [HACKERS] Discussion on missing optimizations - Mailing list pgsql-hackers

From Petr Jelinek
Subject Re: [HACKERS] Discussion on missing optimizations
Date
Msg-id 6d8d660f-f278-1074-5c06-67ea250d1f08@2ndquadrant.com
Whole thread Raw
In response to Re: [HACKERS] Discussion on missing optimizations  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Discussion on missing optimizations
Re: [HACKERS] Discussion on missing optimizations
List pgsql-hackers
On 07/10/17 19:59, Tom Lane wrote:
> Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
>> On 07/10/17 18:15, Tom Lane wrote:
>>> No, I'm afraid you didn't read that comment closely enough.  This will
>>> flat out fail for cases like "select ... where x=x order by x", because
>>> there will already be a single-element EC for x and so the clause will
>>> just disappear entirely.  If that doesn't happen, then instead you're
>>> creating an EC with duplicate entries, which is an idea I seriously
>>> dislike; the semantics of that don't seem clear at all to me.
> 
>> Hmm it did not disappear (and worked fine in SQL level tests).
> 
> I may not be correctly remembering what the query would need to look
> like for there to be single-element ECs in existence at this point; but
> I surely would not like this code to assume that none will exist till
> later.
> 
>> I don't
>> think I fully understand the "EC with duplicate entries" part and what's
>> the problem with it so I'll defer to your wisdom there.
> 
> Well, as one example, assume that we use your patch and consider what
> happens with
>     where x = y and x = x
> vs
>     where x = x and x = y
> 
> In the first case we end up with an EC that's just {x,y}, because the
> second process_equivalence() will find both sides in the same EC and
> conclude that the second clause is redundant.  (Which it is, if the
> equality operators have the same notion of what to do with nulls.)
> In the second case we end up with an EC containing {x,x,y}, which
> at minimum will result in emitting redundant generated quals.  I'm
> not sure if it could have any worse consequences than that, but I'm
> not sure it couldn't either.  But this is bogus in any case, because
> those WHERE clauses surely should produce identical results.
> 
> Looking further ahead, if ECs have to support being multisets rather
> than pure sets, that would put a crimp in ever improving this logic to
> use a smarter UNION-FIND algorithm.  (I've not yet seen queries where the
> number of EC members is large enough to make that a serious issue, but
> I think it could happen, particularly with inheritance/partitioning.)
> 

Okay, that makes sense, thanks for explanation. Your patch is the way to
go then.

>>> This passes the smell test for me in the sense of not adding any
>>> significant number of planner cycles except when the weird case occurs.
>>> It leaves something on the table in that there are some situations
>>> where X=X could be converted, but we won't do it because we don't see
>>> the clause as being a potential EC (because it's not at top level),
>>> as in the second new regression test case below.  I think that's
>>> probably all right; I don't see any way to be more thorough without
>>> adding a lot of new cycles all the time, and I don't believe this is
>>> worth that.
> 
>> My code had same issue. I think going deeper would require quite a bit
>> of new code (and cycles) for something that's even less likely to happen
>> than simple X=X while the current patch is quite cheap win.
> 
> Yeah.  I'm not really convinced it's a big win, but it's so cheap we
> might as well do it.  The only case where it would expend cycles and
> fail to give an improvement is if we have X=X with a non-strict operator,
> which I think is a case that never arises in practice at present,
> because btree effectively assumes that all btree operators are strict.
> (Someday we might want to relax that, though, which is why I don't
> want to let the planner just assume it.)
> 

In real-life probably not, but seems like these small bits can be useful
marketing wise, given articles like the one which started this. And it
should not hurt anything either.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] Horrible CREATE DATABASE Performance in High Sierra
Next
From: Jack Christensen
Date:
Subject: [HACKERS] Prepared statements assume text type in PG10