From: Laurenz Albe
Subject: Re: Bad estimates
Date: ,
Msg-id: 1511363394.2553.68.camel@cybertec.at
(view: Whole thread, Raw)
In response to: Bad estimates  (Artur Zając)
List: pgsql-performance

Tree view

Bad estimates  (Artur Zając, )
 Re: Bad estimates  (Don Seiler, )
  RE: Bad estimates  ("Alex Ignatov", )
 Re: Bad estimates (DEFAULT_UNK_SEL)  (Justin Pryzby, )
 Re: Bad estimates  (Tom Lane, )
  RE: Bad estimates  (Artur Zając, )
 Re: Bad estimates  (Laurenz Albe, )

Artur Zając wrote:
> We have table created like this:
> 
> CREATE TABLE xyz AS SELECT generate_series(1,10000000,1) AS gs;
> 
> Now:
> 
> explain analyze select * from xyz where gs&1=1;

>  Seq Scan on xyz  (cost=0.00..260815.38 rows=68920 width=4)
>                   (actual time=0.044..2959.728 rows=5000000 loops=1)
>    Filter: ((gs & 1) = 1)
>    Rows Removed by Filter: 5000000
[...]
> And one more clause:
> 
> explain analyze select * from xyz where gs&1=1 and gs&2=2 and gs&4=4;

>  Seq Scan on xyz  (cost=0.00..398655.62 rows=2 width=4)
>                   (actual time=0.052..3329.422 rows=1250000 loops=1)
>    Filter: (((gs & 1) = 1) AND ((gs & 2) = 2) AND ((gs & 4) = 4))
>    Rows Removed by Filter: 8750000

> As we can see estimates differs significally from the actual records count -
> only three clauses are reducing estimated number of records from 10000000 to
> 2.
> 
> I noticed that each additional clause reduces the number about 200 times and
> define DEFAULT_NUM_DISTINCT is responsible for this behaviur.
> 
> I think that this variable should be lower or maybe estimation using
> DEFAULT_NUM_DISTTINCT should be done once per table.

The problem is that the expression "gs & 1" is a black box for the
optimizer; it cannot estimate how selective the condition is and falls
back to a default value that is too low.

You can create an index to
a) improve the estimate
and
b) speed up the queries:

CREATE INDEX ON xyz ((gs & 1), (gs & 2), (gs & 4));

Don't forget to ANALYZE afterwards.

Yours,
Laurenz Albe



pgsql-performance by date:

From: Patrick KUI-LI
Date:
Subject: Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade
From: Dmitry Shalashov
Date:
Subject: Re: Query became very slow after 9.6 -> 10 upgrade