From: Tom Lane
Subject: Re: Please help! Query jumps from 1s -> 4m
Date:
Msg-id: 26352.1588714968@sss.pgh.pa.us
In response to: Re: Please help! Query jumps from 1s -> 4m (James Thompson <james@jthompson.dev>)
Responses: Re: Please help! Query jumps from 1s -> 4m (James Thompson <james@jthompson.dev>)
List: pgsql-performance
James Thompson <james@jthompson.dev> writes:
> The slowness occurs when the prepared statement changes to a generic plan.

> Initial plan:
> ->  Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1
> table1alias2  (cost=0.56..2549.70 rows=70 width=36) (actual
> time=1.901..45.256 rows=65000 loops=1)
>     Output: table1alias2.uniqueid
>     Index Cond: ((table1alias2.col20 = '12345'::bigint) AND (table1alias2.
> col8 = ANY ('{c5986b02-3a02-4639-8147-f286972413ba,...
> 98ed24b1-76f5-4b0e-bb94-86cf13a4809c}'::text[])))
>     Heap Fetches: 10
>     Buffers: shared hit=5048

> after 5 executions of the statement:
> ->  Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1
> table1alias2  (cost=0.56..17.23 rows=1 width=36) (actual
> time=125.344..126877.822 rows=65000 loops=1)
>     Output: table1alias2.uniqueid
>     Index Cond: (table1alias2.col20 = $1001)
>     Filter: ((table1alias2.col8)::text = ANY ((ARRAY[$1, ...,
> $1000])::text[]))
>     Rows Removed by Filter: 2670023
>     Heap Fetches: 428
>     Buffers: shared hit=45933 read=42060 dirtied=4

Yeah, this is a dynamic we've seen before.  The rowcount estimate, and
hence the cost estimate, for the plan with explicit parameter values is
way off; but the estimate for the generic plan is even further off,
causing the system to falsely decide that the latter is cheaper.
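
[Editor's sketch, not part of the original mail: the mechanism behind the flip. PostgreSQL plans a prepared statement with the actual parameter values for its first five executions, then considers a cached generic plan, and switches to it if the generic plan's estimated cost is no higher than the average of the custom-plan costs — which is exactly where a bad generic estimate does damage. Table and column names below are taken from the quoted plan.]

```sql
-- Prepare a parameterized statement resembling the one in the report
PREPARE q(bigint) AS
    SELECT uniqueid FROM table1 WHERE col20 = $1;

EXECUTE q(12345);  -- executions 1..5: a custom plan, costed with the
EXECUTE q(12345);  -- real parameter value each time
-- ...
EXECUTE q(12345);  -- from the 6th execution on, the planner may switch
                   -- to the generic plan if its estimated cost looks
                   -- no worse than the average custom-plan cost
```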

I've speculated about refusing to believe a generic cost estimate when it
comes in more than epsilon below the concrete (custom-plan) estimate, but
it's not quite clear how that should work, or whether it'd have failure
modes of its own.
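
[Editor's aside, not in the original mail: since PostgreSQL 12 the heuristic can also be bypassed per session, forcing custom plans for every execution at the cost of re-planning each time.]

```sql
-- Available in PostgreSQL 12 and later; valid values are
-- auto (default), force_custom_plan, and force_generic_plan
SET plan_cache_mode = force_custom_plan;
```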

The one thing that is totally clear is that these rowcount estimates are
crappy.  Can you improve them by increasing the stats target for that
table?  Maybe with less-garbage-y inputs, the system would make the right
plan choice here.
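
[Editor's sketch of the suggested fix, assuming the column names from the quoted plan: the per-column statistics target can be raised above the default (100, via default_statistics_target), followed by an ANALYZE so the new target takes effect.]

```sql
-- Collect finer-grained statistics for the columns the estimates
-- depend on; 1000 here is an illustrative value (the maximum is 10000)
ALTER TABLE table1 ALTER COLUMN col20 SET STATISTICS 1000;
ALTER TABLE table1 ALTER COLUMN col8  SET STATISTICS 1000;
ANALYZE table1;
```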

            regards, tom lane
