Re: Row count estimation bug in BETWEEN? - Mailing list pgsql-general

From Yaroslav
Subject Re: Row count estimation bug in BETWEEN?
Date
Msg-id 1434228710904-5853725.post@n5.nabble.com
Whole thread Raw
In response to Re: Row count estimation bug in BETWEEN?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Row count estimation bug in BETWEEN?  (Yaroslav <ladayaroslav@yandex.ru>)
Re: Row count estimation bug in BETWEEN?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Tom Lane-2 wrote
> PG doesn't try to estimate inequalities exactly, because it usually
> doesn't make enough of a difference to matter.  Currently we don't
> even bother to distinguish say ">" from ">=" for estimation purposes,
> though certainly we would need to in order to deal with zero-width ranges
> with any great amount of precision.

Thank you for your answer!

I'm sorry, but after looking into documentation and sources (scalarineqsel
function in selfuncs.c, clauselist_selectivity and addRangeClause functions
in
clausesel.c) and experimenting a little I've got an impression that
PostgreSQL
actually bothers to distinguish ">" from ">=" for estimation purposes
sometimes (probably, when MCV is used), but in my example it uses histogram
and indeed doesn't distinguish them.

My simple test (using MCVs) is below:
-----
CREATE TABLE t2(n int);
INSERT INTO t2(n) VALUES (0),(0),(0),(0),(1),(1),(1),(1),(2),(2),(2),(2);
ANALYZE t2;

EXPLAIN SELECT * FROM t2 WHERE n > 0 AND n < 2
-- rows=4
EXPLAIN SELECT * FROM t2 WHERE n > 0 AND n < 2
-- rows=12
------

Looking further, I found ineq_histogram_selectivity function in selfuncs.c,
and this fragment seems relevant:
-----
/*
 * We have values[i-1] <= constant <= values[i].
 *
 * Convert the constant and the two nearest bin boundary
 * values to a uniform comparison scale, and do a linear
 * interpolation within this bin.
 */
<skip>
binfrac = (val - low) / (high - low);
-----
And now I'm stuck. Can ">" operators can be distinguished from ">="
operators at this point?




-----
WBR, Yaroslav Schekin.
--
View this message in context: http://postgresql.nabble.com/Row-count-estimation-bug-in-BETWEEN-tp5853687p5853725.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Row count estimation bug in BETWEEN?
Next
From: Yaroslav
Date:
Subject: Re: Row count estimation bug in BETWEEN?