Query performance with min and filter - Mailing list pgsql-general
From | Yorwerth, Adam |
---|---|
Subject | Query performance with min and filter |
Date | |
Msg-id | 0548FF3D-05E1-4C9F-91F5-B2EDA69A589D@tesco.com Whole thread Raw |
Responses |
Re: Query performance with min and filter
Re: Query performance with min and filter |
List | pgsql-general |
Hi Postgres community,
We seem to have found a situation where a query run using explain analyse or pgbench is incredibly fast, but run via Java under load performs very poorly – we’ve checked query performance metrics for our Postgres instance and can confirm that it’s the query running slowly, and not a Java problem.
The table has ~7 million rows, with total size of 44GB. As shown below, explain analyse gives a sub 1ms execution time, yet in practice the query takes an average of 2 seconds – it’s worth mentioning that a small percentage of requests do take ~1ms, so something non-deterministic is happening here.
The query in question is below, with explain analyse:
explain analyse SELECT coalesce (
(SELECT min(msg_offset) - 1 FROM events WHERE created_utc >= CURRENT_TIMESTAMP - INTERVAL '60 SECONDS' ),
(SELECT max(msg_offset) FROM events),
0
);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-
Result (cost=2.03..2.04 rows=1 width=8) (actual time=0.183..0.183 rows=1 loops=1)
InitPlan 1 (returns $1)
-> Aggregate (cost=1.56..1.57 rows=1 width=8) (actual time=0.028..0.028 rows=1 loops=1)
-> Index Only Scan using test_4 on events (cost=0.44..1.56 rows=1 width=8) (actual time=0.023..0.023 rows=0 loops=1)
Index Cond: (created_utc >= (now() - '00:01:00'::interval))
Heap Fetches: 0
InitPlan 3 (returns $3)
-> Result (cost=0.45..0.46 rows=1 width=8) (actual time=0.153..0.153 rows=1 loops=1)
InitPlan 2 (returns $2)
-> Limit (cost=0.43..0.45 rows=1 width=8) (actual time=0.152..0.152 rows=1 loops=1)
-> Index Only Scan Backward using events_pkey on events events_1 (cost=0.43..156842.57 rows=7643099 width=8) (actual time=0.151..0.151 rows=1 loops=1)
Index Cond: (msg_offset IS NOT NULL)
Heap Fetches: 0
Planning time: 2.480 ms
Execution time: 0.242 ms
(15 rows)
Output of \d events, with a few test indexes added to try and improve performance:
Table "public.events"
Column | Type | Collation | Nullable | Default
--------------+-----------------------------+-----------+----------+--------------------------------------------
msg_offset | bigint | | not null | nextval('events_msg_offset_seq'::regclass)
msg_key | character varying | | not null |
content_type | character varying | | not null |
type | character varying | | not null |
created_utc | timestamp without time zone | | not null |
data | text | | |
event_size | integer | | not null | 1048576
cluster_id | bigint | | not null | 1
Indexes:
"events_pkey" PRIMARY KEY, btree (msg_offset)
"cluster_idx" btree (msg_key, cluster_id)
"cluster_type_filter_idx" btree (type, cluster_id)
"created_idx" btree (created_utc, msg_key)
"key_idx" btree (msg_key)
"test_1" btree (created_utc, msg_offset)
"test_2" btree (msg_offset, created_utc)
"test_3" btree (created_utc, msg_offset)
"test_4" btree (created_utc, msg_offset DESC)
"type_idx" btree (type)
Any help understanding what is happening here, or guidance to improve performance would be much appreciated.
Regards,
Adam
pgsql-general by date: