Thread: Query performance with min and filter
Hi Postgres community,
We seem to have found a situation where a query run using explain analyse or pgbench is incredibly fast, but run via Java under load performs very poorly – we’ve checked query performance metrics for our Postgres instance and can confirm that it’s the query running slowly, and not a Java problem.
The table has ~7 million rows, with total size of 44GB. As shown below, explain analyse gives a sub 1ms execution time, yet in practice the query takes an average of 2 seconds – it’s worth mentioning that a small percentage of requests do take ~1ms, so something non-deterministic is happening here.
The query in question is below, with explain analyse:
explain analyse SELECT coalesce (
(SELECT min(msg_offset) - 1 FROM events WHERE created_utc >= CURRENT_TIMESTAMP - INTERVAL '60 SECONDS' ),
(SELECT max(msg_offset) FROM events),
0
);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-
Result (cost=2.03..2.04 rows=1 width=8) (actual time=0.183..0.183 rows=1 loops=1)
InitPlan 1 (returns $1)
-> Aggregate (cost=1.56..1.57 rows=1 width=8) (actual time=0.028..0.028 rows=1 loops=1)
-> Index Only Scan using test_4 on events (cost=0.44..1.56 rows=1 width=8) (actual time=0.023..0.023 rows=0 loops=1)
Index Cond: (created_utc >= (now() - '00:01:00'::interval))
Heap Fetches: 0
InitPlan 3 (returns $3)
-> Result (cost=0.45..0.46 rows=1 width=8) (actual time=0.153..0.153 rows=1 loops=1)
InitPlan 2 (returns $2)
-> Limit (cost=0.43..0.45 rows=1 width=8) (actual time=0.152..0.152 rows=1 loops=1)
-> Index Only Scan Backward using events_pkey on events events_1 (cost=0.43..156842.57 rows=7643099 width=8) (actual time=0.151..0.151 rows=1 loops=1)
Index Cond: (msg_offset IS NOT NULL)
Heap Fetches: 0
Planning time: 2.480 ms
Execution time: 0.242 ms
(15 rows)
Output of \d events, with a few test indexes added to try and improve performance:
Table "public.events"
Column | Type | Collation | Nullable | Default
--------------+-----------------------------+-----------+----------+--------------------------------------------
msg_offset | bigint | | not null | nextval('events_msg_offset_seq'::regclass)
msg_key | character varying | | not null |
content_type | character varying | | not null |
type | character varying | | not null |
created_utc | timestamp without time zone | | not null |
data | text | | |
event_size | integer | | not null | 1048576
cluster_id | bigint | | not null | 1
Indexes:
"events_pkey" PRIMARY KEY, btree (msg_offset)
"cluster_idx" btree (msg_key, cluster_id)
"cluster_type_filter_idx" btree (type, cluster_id)
"created_idx" btree (created_utc, msg_key)
"key_idx" btree (msg_key)
"test_1" btree (created_utc, msg_offset)
"test_2" btree (msg_offset, created_utc)
"test_3" btree (created_utc, msg_offset)
"test_4" btree (created_utc, msg_offset DESC)
"type_idx" btree (type)
Any help understanding what is happening here, or guidance to improve performance would be much appreciated.
Regards,
Adam
Sent: Monday, August 31, 2020 2:57 PM
To: pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Cc: Cianciaruso, Lorenzo <lorenzo.cianciaruso@tesco.com>
Subject: Query performance with min and filter
Hi Postgres community,
We seem to have found a situation where a query run using explain analyse or pgbench is incredibly fast, but run via Java under load performs very poorly – we’ve checked query performance metrics for our Postgres instance and can confirm that it’s the query running slowly, and not a Java problem.
The table has ~7 million rows, with total size of 44GB. As shown below, explain analyse gives a sub 1ms execution time, yet in practice the query takes an average of 2 seconds – it’s worth mentioning that a small percentage of requests do take ~1ms, so something non-deterministic is happening here.
The query in question is below, with explain analyse:
explain analyse SELECT coalesce (
(SELECT min(msg_offset) - 1 FROM events WHERE created_utc >= CURRENT_TIMESTAMP - INTERVAL '60 SECONDS' ),
(SELECT max(msg_offset) FROM events),
0
);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-
Result (cost=2.03..2.04 rows=1 width=8) (actual time=0.183..0.183 rows=1 loops=1)
InitPlan 1 (returns $1)
-> Aggregate (cost=1.56..1.57 rows=1 width=8) (actual time=0.028..0.028 rows=1 loops=1)
-> Index Only Scan using test_4 on events (cost=0.44..1.56 rows=1 width=8) (actual time=0.023..0.023 rows=0 loops=1)
Index Cond: (created_utc >= (now() - '00:01:00'::interval))
Heap Fetches: 0
InitPlan 3 (returns $3)
-> Result (cost=0.45..0.46 rows=1 width=8) (actual time=0.153..0.153 rows=1 loops=1)
InitPlan 2 (returns $2)
-> Limit (cost=0.43..0.45 rows=1 width=8) (actual time=0.152..0.152 rows=1 loops=1)
-> Index Only Scan Backward using events_pkey on events events_1 (cost=0.43..156842.57 rows=7643099 width=8) (actual time=0.151..0.151 rows=1 loops=1)
Index Cond: (msg_offset IS NOT NULL)
Heap Fetches: 0
Planning time: 2.480 ms
Execution time: 0.242 ms
(15 rows)
Output of \d events, with a few test indexes added to try and improve performance:
Table "public.events"
Column | Type | Collation | Nullable | Default
--------------+-----------------------------+-----------+----------+--------------------------------------------
msg_offset | bigint | | not null | nextval('events_msg_offset_seq'::regclass)
msg_key | character varying | | not null |
content_type | character varying | | not null |
type | character varying | | not null |
created_utc | timestamp without time zone | | not null |
data | text | | |
event_size | integer | | not null | 1048576
cluster_id | bigint | | not null | 1
Indexes:
"events_pkey" PRIMARY KEY, btree (msg_offset)
"cluster_idx" btree (msg_key, cluster_id)
"cluster_type_filter_idx" btree (type, cluster_id)
"created_idx" btree (created_utc, msg_key)
"key_idx" btree (msg_key)
"test_1" btree (created_utc, msg_offset)
"test_2" btree (msg_offset, created_utc)
"test_3" btree (created_utc, msg_offset)
"test_4" btree (created_utc, msg_offset DESC)
"type_idx" btree (type)
Any help understanding what is happening here, or guidance to improve performance would be much appreciated.
Regards,
Adam
"Yorwerth, Adam" <Adam.Yorwerth@tesco.com> writes: > We seem to have found a situation where a query run using explain analyse or pgbench is incredibly fast, but run via Javaunder load performs very poorly – we’ve checked query performance metrics for our Postgres instance and can confirm thatit’s the query running slowly, and not a Java problem. Usually when people report something like this, the explanation is that it's not actually the same query being tested in both cases. In particular, if you're parameterizing the query in Java but substituting constant values when running it manually, that can make a difference. A further-afield possibility is that your application has been set up to run with custom values of work_mem, planner cost constants, etc, which you're failing to duplicate in the manual tests. If all else fails, try installing auto_explain so that you can capture the actual query plan used when the application executes this query. regards, tom lane