From: Tom Lane
Subject: Re: Please help! Query jumps from 1s -> 4m
Date:
Msg-id: 26352.1588714968@sss.pgh.pa.us
In response to: Re: Please help! Query jumps from 1s -> 4m (James Thompson <james@jthompson.dev>)
Responses: Re: Please help! Query jumps from 1s -> 4m (James Thompson <james@jthompson.dev>)
List: pgsql-performance
James Thompson <james@jthompson.dev> writes:
> The slowness occurs when the prepared statement changes to a generic plan.

> Initial plan:
> ->  Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1
> table1alias2  (cost=0.56..2549.70 rows=70 width=36) (actual
> time=1.901..45.256 rows=65000 loops=1)
>     Output: table1alias2.uniqueid
>     Index Cond: ((table1alias2.col20 = '12345'::bigint) AND (table1alias2.
> col8 = ANY ('{c5986b02-3a02-4639-8147-f286972413ba,...
> 98ed24b1-76f5-4b0e-bb94-86cf13a4809c}'::text[])))
>     Heap Fetches: 10
>     Buffers: shared hit=5048

> after 5 executions of the statement:
> ->  Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1
> table1alias2  (cost=0.56..17.23 rows=1 width=36) (actual
> time=125.344..126877.822 rows=65000 loops=1)
>     Output: table1alias2.uniqueid
>     Index Cond: (table1alias2.col20 = $1001)
>     Filter: ((table1alias2.col8)::text = ANY ((ARRAY[$1, ...,
> $1000])::text[]))
>     Rows Removed by Filter: 2670023
>     Heap Fetches: 428
>     Buffers: shared hit=45933 read=42060 dirtied=4

Yeah, this is a dynamic we've seen before.  The rowcount estimate, and
hence the cost estimate, for the plan with explicit parameter values is
way off; but the estimate for the generic plan is even further off,
causing the system to falsely decide that the latter is cheaper.
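
[Editor's sketch, not part of the original mail: the mechanism behind the flip. PostgreSQL plans a prepared statement with the actual parameter values for its first five executions, then considers a cached generic plan, and switches to it if the generic plan's estimated cost is no higher than the average of the custom-plan costs — which is exactly where a bad generic estimate does damage. Table and column names below are taken from the quoted plan.]

```sql
-- Prepare a parameterized statement resembling the one in the report
PREPARE q(bigint) AS
    SELECT uniqueid FROM table1 WHERE col20 = $1;

EXECUTE q(12345);  -- executions 1..5: a custom plan, costed with the
EXECUTE q(12345);  -- real parameter value each time
-- ...
EXECUTE q(12345);  -- from the 6th execution on, the planner may switch
                   -- to the generic plan if its estimated cost looks
                   -- no worse than the average custom-plan cost
```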

I've speculated about refusing to believe a generic cost estimate when it
comes in more than epsilon below the concrete (custom-plan) estimate, but
it's not quite clear how that should work, or whether it'd have failure
modes of its own.
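
[Editor's aside, not in the original mail: since PostgreSQL 12 the heuristic can also be bypassed per session, forcing custom plans for every execution at the cost of re-planning each time.]

```sql
-- Available in PostgreSQL 12 and later; valid values are
-- auto (default), force_custom_plan, and force_generic_plan
SET plan_cache_mode = force_custom_plan;
```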

The one thing that is totally clear is that these rowcount estimates are
crappy.  Can you improve them by increasing the stats target for that
table?  Maybe with less-garbage-y inputs, the system would make the right
plan choice here.
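
[Editor's sketch of the suggested fix, assuming the column names from the quoted plan: the per-column statistics target can be raised above the default (100, via default_statistics_target), followed by an ANALYZE so the new target takes effect.]

```sql
-- Collect finer-grained statistics for the columns the estimates
-- depend on; 1000 here is an illustrative value (the maximum is 10000)
ALTER TABLE table1 ALTER COLUMN col20 SET STATISTICS 1000;
ALTER TABLE table1 ALTER COLUMN col8  SET STATISTICS 1000;
ANALYZE table1;
```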

            regards, tom lane
