Home > mailing lists

Re: Bad estimates (DEFAULT_UNK_SEL) - Mailing list pgsql-performance

From	Justin Pryzby
Subject	Re: Bad estimates (DEFAULT_UNK_SEL)
Date	November 22, 2017 17:57:13
Msg-id	20171122145713.GU5668@telsasoft.com Whole thread Raw
In response to	Bad estimates (Artur Zając <azajac@ang.com.pl>)
List	pgsql-performance

Tree view

On Wed, Nov 22, 2017 at 03:29:54PM +0100, Artur Zając wrote:
> CREATE TABLE xyz AS SELECT generate_series(1,10000000,1) AS gs;
> 
> db=# explain analyze select * from xyz where gs&1=1;
>  Seq Scan on xyz  (cost=0.00..260815.38 rows=68920 width=4) (actual time=0.044..2959.728 rows=5000000 loops=1)
...
> newrr=# explain analyze select * from xyz where gs&1=1 and gs&2=2 and gs&4=4;
>  Seq Scan on xyz  (cost=0.00..398655.62 rows=2 width=4) (actual time=0.052..3329.422 rows=1250000 loops=1)

> I noticed that each additional clause reduces the number about 200 times and
> define DEFAULT_NUM_DISTINCT is responsible for this behaviur.

I think it's actually:

src/include/utils/selfuncs.h-/* default selectivity estimate for boolean and null test nodes */
src/include/utils/selfuncs.h-#define DEFAULT_UNK_SEL                    0.005

..which is 1/200.

Note, you can do this, which helps a bit by collecting stats for the index
expr:

postgres=# CREATE INDEX ON xyz((gs&1));
postgres=# ANALYZE xyz;
postgres=# explain analyze SELECT * FROM xyz WHERE gs&1=1 AND gs&2=2 AND gs&4=4;Bitmap Heap Scan on xyz
(cost=91643.59..259941.99rows=124 width=4) (actual time=472.376..2294.035 rows=1250000 loops=1)  Recheck Cond: ((gs &
1)= 1)  Filter: (((gs & 2) = 2) AND ((gs & 4) = 4))  Rows Removed by Filter: 3750000  Heap Blocks: exact=44248  ->
BitmapIndex Scan on xyz_expr_idx  (cost=0.00..91643.55 rows=4962016 width=0) (actual time=463.477..463.477 rows=5000000
loops=1)       Index Cond: ((gs & 1) = 1)
 

Justin

pgsql-performance by date:

From: Dmitry Shalashov
Date: 22 November 2017, 17:51:22
Subject: Re: Query became very slow after 9.6 -> 10 upgrade

From: Tom Lane
Date: 22 November 2017, 18:01:57
Subject: Re: Bad estimates

Re: Bad estimates (DEFAULT_UNK_SEL) - Mailing list pgsql-performance

Previous

Next