Re: [HACKERS] Making clausesel.c Smarter - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: [HACKERS] Making clausesel.c Smarter |
Date | |
Msg-id | CAKJS1f_CC59STaKEqV4vsN4KUVRA6yYt8sspmm93R=+R28fnhA@mail.gmail.com Whole thread Raw |
In response to | [HACKERS] Making clausesel.c Smarter (David Rowley <david.rowley@2ndquadrant.com>) |
Responses |
Re: [HACKERS] Making clausesel.c Smarter
|
List | pgsql-hackers |
On 4 April 2017 at 13:35, Claudio Freire <klaussfreire@gmail.com> wrote: > On Mon, Apr 3, 2017 at 9:19 PM, Claudio Freire <klaussfreire@gmail.com> wrote: > If you ask me, I'd just leave: > > + if (rqlist->hibound == DEFAULT_INEQ_SEL || rqlist->lobound == > DEFAULT_INEQ_SEL) > + { > + /* No point in checking null selectivity, DEFAULT_INEQ_SEL > implies we have no stats */ > + s2 = DEFAULT_RANGE_INEQ_SEL; > + } > + else > + { > ... > + s2 = rqlist->hibound + rqlist->lobound - 1.0; > + nullsel = nulltestsel(root, IS_NULL, rqlist->var, varRelid); > + notnullsel = 1.0 - nullsel; > + > + /* Adjust for double-exclusion of NULLs */ > + s2 += nullsel; > + if (s2 <= 0.0) { > + if (s2 <= (-1.0e-4 * notnullsel - 1.0e-6)) > + { > + /* Most likely contradictory clauses found */ > + s2 = (1.0e-10 < notnullsel) ? 1.0e-10 : notnullsel; > + } > + else > + { > + /* Could be a rounding error */ > + s2 = DEFAULT_RANGE_INEQ_SEL * notnullsel; > + } > + } > + } > > Where (-1.0e-4 * notnullsel - 1.0e-6) is just a very rough (and overly > cautious) estimation of the amount of rounding error that could be > there with 32-bit floats. > > The above assumes a non-DEFAULT_INEQ_SEL value in lobound/hibound is > an estimation based on stats, maybe approximate, but one that makes > sense (ie: preserves the monotonicity of the CPF), and as such > negative values are either sign of a contradiction or rounding error. > The previous code left the possibility of negatives coming out of > default selectivities creeping up on non-DEFAULT_INEQ_SEL estimates, > but that doesn't seem possible coming out of scalarineqsel. I notice this does change the estimates for contradictory clauses such as: create table a (value int); insert into a select x/100 from generate_Series(1,10000) x; analyze a; explain analyze select * from a where value between 101 and -1; We now get: QUERY PLAN --------------------------------------------------------------------------------------------- Seq Scan on a (cost=0.00..195.00 rows=1 width=4) (actual time=1.904..1.904 rows=0 loops=1) Filter: ((value >= 101) AND (value <= '-1'::integer)) Rows Removed by Filter: 10000 Planning time: 0.671 ms Execution time: 1.950 ms (5 rows) where before we'd get: QUERY PLAN ---------------------------------------------------------------------------------------------- Seq Scan on a (cost=0.00..195.00 rows=50 width=4) (actual time=0.903..0.903 rows=0 loops=1) Filter: ((value >= 101) AND (value <= '-1'::integer)) Rows Removed by Filter: 10000 Planning time: 0.162 ms Execution time: 0.925 ms (5 rows) Which comes from the 10000 * 0.005 selectivity estimate from tuples * DEFAULT_RANGE_INEQ_SEL I've attached a patch to this effect. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: