Performance regression when adding LIMIT 1 to a query - Mailing list pgsql-general
From | Costa Alexoglou |
---|---|
Subject | Performance regression when adding LIMIT 1 to a query |
Date | |
Msg-id | CAJ+5Ff7s1NxcyZ4x-FdrygRT0KMuyROVPrP+Pm_1Z2DwMdv07Q@mail.gmail.com Whole thread Raw |
Responses |
Re: Performance regression when adding LIMIT 1 to a query
|
List | pgsql-general |
Hey folks,
I faced an interesting regression and would love to have some help understanding why this happened.
The postgres version if it's of any use is PostgreSQL 16.3 .
The postgres version if it's of any use is PostgreSQL 16.3 .
The following query:
```
EXPLAIN ANALYZE
SELECT "databases_metrics"."metrics"
FROM
"databases_metrics"
WHERE
(
"databases_metrics"."created_at" >= '2023-03-15 10:00:00+00:00'::timestamptz
AND "databases_metrics"."db_instance_id" = 'c4c97a60-b88e-4cd3-a2f1-random-uuid'::UUID
AND "databases_metrics"."created_at" <= '2025-04-03 10:00:00+00:00'::timestamptz
AND (
metrics -> 'perf_average_query_runtime' IS NOT NULL
)
)
ORDER BY
"databases_metrics"."id" ASC;
EXPLAIN ANALYZE
SELECT "databases_metrics"."metrics"
FROM
"databases_metrics"
WHERE
(
"databases_metrics"."created_at" >= '2023-03-15 10:00:00+00:00'::timestamptz
AND "databases_metrics"."db_instance_id" = 'c4c97a60-b88e-4cd3-a2f1-random-uuid'::UUID
AND "databases_metrics"."created_at" <= '2025-04-03 10:00:00+00:00'::timestamptz
AND (
metrics -> 'perf_average_query_runtime' IS NOT NULL
)
)
ORDER BY
"databases_metrics"."id" ASC;
```
with a plan:
```
Gather Merge (cost=115584.47..118515.35 rows=25120 width=824) (actual time=46.004..74.267 rows=29653 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=114584.45..114615.85 rows=12560 width=824) (actual time=41.200..47.322 rows=9884 loops=3)
Sort Key: id
Sort Method: external merge Disk: 16360kB
Worker 0: Sort Method: external merge Disk: 15552kB
Worker 1: Sort Method: external merge Disk: 14536kB
-> Parallel Bitmap Heap Scan on databases_metrics (cost=990.77..109175.83 rows=12560 width=824) (actual time=3.326..14.295 rows=9884 loops=3)
Recheck Cond: ((db_instance_id = 'c4c97a60-b88e-4cd3-a2f1-random-uuid'::uuid) AND (created_at >= '2023-03-15 10:00:00+00'::timestamp with time zone) AND (created_at <= '2025-04-03 10:00:00+00'::timestamp with time zone))"},
Filter: ((metrics -> 'perf_average_query_runtime'::text) IS NOT NULL)
Rows Removed by Filter: 68
Heap Blocks: exact=4272
-> Bitmap Index Scan on idx_databases_metrics_instance_date_custom_created_debugging (cost=0.00..983.24 rows=30294 width=0) (actual time=3.786.786 rows=29856 loops=1)"},
Index Cond: ((db_instance_id = 'c4c97a60-b88e-4cd3-a2f1-random-uuid'::uuid) AND (created_at >= '2023-03-15 10:00:00+00'::timestamp with time zone) AND (created_at <= '2025-04-03 10:00:00+00'::timestamp with time zone))"},
Planning Time: 0.126 ms
Execution Time: 79.334 ms
```
Results in an execution time of `79.334ms` which is ok.
Then if I only add a `LIMIT 1` at the end of the query, I get dramatically slower execution:
with a plan:
```
Gather Merge (cost=115584.47..118515.35 rows=25120 width=824) (actual time=46.004..74.267 rows=29653 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=114584.45..114615.85 rows=12560 width=824) (actual time=41.200..47.322 rows=9884 loops=3)
Sort Key: id
Sort Method: external merge Disk: 16360kB
Worker 0: Sort Method: external merge Disk: 15552kB
Worker 1: Sort Method: external merge Disk: 14536kB
-> Parallel Bitmap Heap Scan on databases_metrics (cost=990.77..109175.83 rows=12560 width=824) (actual time=3.326..14.295 rows=9884 loops=3)
Recheck Cond: ((db_instance_id = 'c4c97a60-b88e-4cd3-a2f1-random-uuid'::uuid) AND (created_at >= '2023-03-15 10:00:00+00'::timestamp with time zone) AND (created_at <= '2025-04-03 10:00:00+00'::timestamp with time zone))"},
Filter: ((metrics -> 'perf_average_query_runtime'::text) IS NOT NULL)
Rows Removed by Filter: 68
Heap Blocks: exact=4272
-> Bitmap Index Scan on idx_databases_metrics_instance_date_custom_created_debugging (cost=0.00..983.24 rows=30294 width=0) (actual time=3.786.786 rows=29856 loops=1)"},
Index Cond: ((db_instance_id = 'c4c97a60-b88e-4cd3-a2f1-random-uuid'::uuid) AND (created_at >= '2023-03-15 10:00:00+00'::timestamp with time zone) AND (created_at <= '2025-04-03 10:00:00+00'::timestamp with time zone))"},
Planning Time: 0.126 ms
Execution Time: 79.334 ms
```
Results in an execution time of `79.334ms` which is ok.
Then if I only add a `LIMIT 1` at the end of the query, I get dramatically slower execution:
```
EXPLAIN ANALYZE
SELECT "databases_metrics"."metrics"
FROM
"databases_metrics"
WHERE
(
"databases_metrics"."created_at" >= '2023-03-15 10:00:00+00:00'::timestamptz
AND "databases_metrics"."db_instance_id" = 'c4c97a60-b88e-4cd3-a2f1-random-uuid'::UUID
AND "databases_metrics"."created_at" <= '2025-04-03 10:00:00+00:00'::timestamptz
AND (
metrics -> 'perf_average_query_runtime' IS NOT NULL
)
)
ORDER BY
"databases_metrics"."id" ASC LIMIT 1;
```
With a plan:
```
Limit (cost=0.43..229.66 rows=1 width=824) (actual time=7538.004..7538.005 rows=1 loops=1)
-> Index Scan using databases_metrics_pkey on databases_metrics (cost=0.43..6909156.38 rows=30142 width=824) (actual time=7538.002..7538.003 rows=1 loops=1)
Filter: ((created_at >= '2023-03-15 10:00:00+00'::timestamp with time zone) AND (created_at <= '2025-04-03 10:00:00+00'::timestamp with time zone) A((metrics -> 'perf_average_query_runtime'::text) IS NOT NULL) AND (db_instance_id = 'c4c97a60-b88e-4cd3-a2f1-random-uuid'::uuid))"},
Rows Removed by Filter: 10244795
Planning Time: 0.128 ms
Execution Time: 7538.032 ms
```
Any clue why this may be happening?
EXPLAIN ANALYZE
SELECT "databases_metrics"."metrics"
FROM
"databases_metrics"
WHERE
(
"databases_metrics"."created_at" >= '2023-03-15 10:00:00+00:00'::timestamptz
AND "databases_metrics"."db_instance_id" = 'c4c97a60-b88e-4cd3-a2f1-random-uuid'::UUID
AND "databases_metrics"."created_at" <= '2025-04-03 10:00:00+00:00'::timestamptz
AND (
metrics -> 'perf_average_query_runtime' IS NOT NULL
)
)
ORDER BY
"databases_metrics"."id" ASC LIMIT 1;
```
With a plan:
```
Limit (cost=0.43..229.66 rows=1 width=824) (actual time=7538.004..7538.005 rows=1 loops=1)
-> Index Scan using databases_metrics_pkey on databases_metrics (cost=0.43..6909156.38 rows=30142 width=824) (actual time=7538.002..7538.003 rows=1 loops=1)
Filter: ((created_at >= '2023-03-15 10:00:00+00'::timestamp with time zone) AND (created_at <= '2025-04-03 10:00:00+00'::timestamp with time zone) A((metrics -> 'perf_average_query_runtime'::text) IS NOT NULL) AND (db_instance_id = 'c4c97a60-b88e-4cd3-a2f1-random-uuid'::uuid))"},
Rows Removed by Filter: 10244795
Planning Time: 0.128 ms
Execution Time: 7538.032 ms
```
Any clue why this may be happening?
pgsql-general by date: