
Re: Use extended statistics to estimate (Var op Var) clauses - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: Use extended statistics to estimate (Var op Var) clauses
Date	August 20, 2021 18:56:43
Msg-id	CA+TgmobiQtcme20UH9TdYp5iE0Oc8M3nGMkz4HRMcKfgwfsRxQ@mail.gmail.com
In response to	Re: Use extended statistics to estimate (Var op Var) clauses (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses	Re: Use extended statistics to estimate (Var op Var) clauses
List	pgsql-hackers


On Fri, Aug 20, 2021 at 2:21 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> After looking at this for a while, it's clear the main issue is handling
> of clauses referencing the same Var twice, like for example (a = a) or
> (a < a). But it's not clear to me if this is something worth fixing, or
> if extended statistics is the right place to do it.
>
> If those clauses are worth the effort, why not handle them better
> even without extended statistics? We can easily evaluate these clauses
> on per-column MCV, because they only reference a single Var.

+1.

It seems to me that what we ought to do is make "a < a", "a > a", and
"a != a" all have an estimate of zero, and make "a <= a", "a >= a",
and "a = a" estimate 1-nullfrac. The extended statistics mechanism can
just ignore the first three types of clauses; the zero estimate has to
be 100% correct. It can't necessarily ignore the second three cases,
though. If the query says "WHERE a = a AND b = 1", "b = 1" may be more
or less likely given that a is known to be not null, and extended
statistics can tell us that.
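
To make the rule above concrete, here is a minimal sketch in Python. It is
not PostgreSQL code: the function name same_var_selectivity and the operator
sets are hypothetical, and nullfrac stands for the column's null fraction as
a number between 0 and 1.

```python
# Illustrative sketch only; none of these names exist in PostgreSQL.

# Comparing a column with itself using these operators can never be true:
# non-null values are equal to themselves, and NULL yields NULL, not true.
ALWAYS_FALSE_OPS = {"<", ">", "!=", "<>"}

# These operators are true for every non-null value of the column.
TRUE_WHEN_NOT_NULL_OPS = {"<=", ">=", "="}


def same_var_selectivity(op: str, nullfrac: float) -> float:
    """Estimate the selectivity of a clause of the form (a op a)."""
    if not 0.0 <= nullfrac <= 1.0:
        raise ValueError("nullfrac must be between 0 and 1")
    if op in ALWAYS_FALSE_OPS:
        return 0.0
    if op in TRUE_WHEN_NOT_NULL_OPS:
        return 1.0 - nullfrac
    raise ValueError(f"unsupported operator: {op}")


# Example: a column in which a quarter of the rows are NULL.
print(same_var_selectivity("<", 0.25))
print(same_var_selectivity("=", 0.25))
```

The sketch covers only the per-clause estimate. It does not model the
second point, that "a = a" implies a is not null and may therefore change
the estimate for other clauses such as "b = 1".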

-- 
Robert Haas
EDB: http://www.enterprisedb.com
