Home > mailing lists

Re: Please help! Query jumps from 1s -> 4m - Mailing list pgsql-performance

From	James Thompson
Subject	Re: Please help! Query jumps from 1s -> 4m
Date	May 6, 2020 00:10:18
Msg-id	CABoe=cR48w59g7PgVRFacsocWcfdgRejiWomQkxYr3U7OV0b9w@mail.gmail.com Whole thread Raw
In response to	Re: Please help! Query jumps from 1s -> 4m (Justin Pryzby <pryzby@telsasoft.com>)
Responses	Re: Please help! Query jumps from 1s -> 4m (Justin Pryzby <pryzby@telsasoft.com>) Re: Please help! Query jumps from 1s -> 4m (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-performance

Tree view

I've managed to replicate this now with prepared statements. Thanks for all the guidance so far.

The slowness occurs when the prepared statement changes to a generic plan.

Initial plan:

-> Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1 table1alias2 (cost=0.56..2549.70 rows=70 width=36) (actual time=1.901..45.256 rows=65000 loops=1)
Output: table1alias2.uniqueid
Index Cond: ((table1alias2.col20 = '12345'::bigint) AND (table1alias2.col8 = ANY ('{c5986b02-3a02-4639-8147-f286972413ba,... 98ed24b1-76f5-4b0e-bb94-86cf13a4809c}'::text[])))
Heap Fetches: 10
Buffers: shared hit=5048

after 5 executions of the statement:

-> Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1 table1alias2 (cost=0.56..17.23 rows=1 width=36) (actual time=125.344..126877.822 rows=65000 loops=1)
Output: table1alias2.uniqueid
Index Cond: (table1alias2.col20 = $1001)
Filter: ((table1alias2.col8)::text = ANY ((ARRAY[$1, ..., $1000])::text[]))
Rows Removed by Filter: 2670023
Heap Fetches: 428
Buffers: shared hit=45933 read=42060 dirtied=4

The second plan looks worse to me as it's applying a filter rather than using an index condition? I don't understand why it's not part of the condition and also why this is so much slower though.

If I force a retrieval of all index rows for col20 = '12345' using an ad-hoc query (below, which in my mind is what the 'bad' plan is doing), that only takes 2s (2.7 mil rows). Where's the difference?

EXPLAIN (ANALYZE, BUFFERS, TIMING) SELECT COUNT(DISTINCT id) FROM table1 WHERE datatype='TypeA' AND col20 = 12345;

-> Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1 (cost=0.56..2762.95 rows=55337 width=8) (actual time=3.976..1655.645 rows=2735023 loops=1)
Index Cond: (col20 = 12345)
Heap Fetches: 417
Buffers: shared hit=43843 read=44147 dirtied=8

>You could do: SET track_io_timing TO on;

I've not tried this yet, and haven't used it before - sounds like there's some risks associated with running it on a production server / clocks going backwards?

>Also, you should be able to check if that's the problem by doing either:
>plan_cache_mode = force_generic_plan;
>Or (I would think) DISCARD PLANS;

I think plan_cache_mode is just for pg12+ unfortunately? We're on 11 currently.

Just tested DISCARD PLANS locally, it didn't switch back from the generic plan. Was that your expectation?

pgsql-performance by date:

From: Justin Pryzby
Date: 05 May 2020, 22:28:36
Subject: Re: Please help! Query jumps from 1s -> 4m

From: Justin Pryzby
Date: 06 May 2020, 00:35:54
Subject: Re: Please help! Query jumps from 1s -> 4m

Re: Please help! Query jumps from 1s -> 4m - Mailing list pgsql-performance

Previous

Next