BUG #17544: Join order for INNER JOIN ... USING with GROUP BY on join column affects selected query plan - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #17544: Join order for INNER JOIN ... USING with GROUP BY on join column affects selected query plan |
Date | |
Msg-id | 17544-ebd06b00b8836a04@postgresql.org Whole thread Raw |
Responses |
Re: BUG #17544: Join order for INNER JOIN ... USING with GROUP BY on join column affects selected query plan
Re: BUG #17544: Join order for INNER JOIN ... USING with GROUP BY on join column affects selected query plan |
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 17544 Logged by: Eric Sheng Email address: _@ericsheng.com PostgreSQL version: 14.4 Operating system: Ubuntu 20.04 Description: It seems like the join order of a simple INNER JOIN ... USING(...) affects query plan selection even when only joining two tables, when used with GROUP BY on the join column. Schema: groups with priority, containing zero or more tasks with a boolean CREATE TABLE groups(group_id INT PRIMARY KEY NOT NULL, priority INT); CREATE TABLE tasks(task_id INT PRIMARY KEY NOT NULL, group_id INT NOT NULL REFERENCES groups(group_id), finished BOOLEAN NOT NULL); CREATE INDEX groups_priority_index ON groups(priority, group_id); CREATE INDEX tasks_group_id_index ON tasks(group_id); Data: 1000 groups (random priority) with 1000 tasks per group DO $$ BEGIN FOR i IN 0 .. 999 LOOP INSERT INTO groups(group_id, priority) VALUES(i, FLOOR(RANDOM() * 2000000000)::INT); FOR j in 0 .. 999 LOOP INSERT INTO tasks(task_id, group_id, finished) VALUES (1000 * i + j, i, RANDOM() * 100 < 99); END LOOP; END LOOP; END; $$; Queries: fetching the lowest priority groups and whether or not all tasks in them have finished (same query twice, just different join order) EXPLAIN (ANALYZE, TIMING) SELECT group_id, BOOL_AND(finished) FROM groups INNER JOIN tasks USING(group_id) GROUP BY group_id, priority ORDER BY priority ASC LIMIT 10; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=15771.99..15772.02 rows=10 width=9) (actual time=293.054..302.919 rows=10 loops=1) -> Sort (cost=15771.99..15777.64 rows=2260 width=9) (actual time=293.051..302.914 rows=10 loops=1) Sort Key: groups.priority Sort Method: top-N heapsort Memory: 25kB -> Finalize HashAggregate (cost=15700.56..15723.16 rows=2260 width=9) (actual time=292.744..302.777 rows=1000 loops=1) Group Key: groups.group_id Batches: 1 Memory Usage: 241kB -> Gather (cost=15203.36..15677.96 rows=4520 width=9) (actual time=291.308..301.725 rows=2836 loops=1) Workers Planned: 2 Workers Launched: 2 -> Partial HashAggregate (cost=14203.36..14225.96 rows=2260 width=9) (actual time=263.344..263.560 rows=945 loops=3) Group Key: groups.group_id Batches: 1 Memory Usage: 241kB Worker 0: Batches: 1 Memory Usage: 241kB Worker 1: Batches: 1 Memory Usage: 241kB -> Hash Join (cost=60.85..11725.61 rows=495550 width=9) (actual time=0.719..182.977 rows=333333 loops=3) Hash Cond: (tasks.group_id = groups.group_id) -> Parallel Seq Scan on tasks (cost=0.00..10361.50 rows=495550 width=5) (actual time=0.044..61.366 rows=333333 loops=3) -> Hash (cost=32.60..32.60 rows=2260 width=8) (actual time=0.627..0.627 rows=1000 loops=3) Buckets: 4096 Batches: 1 Memory Usage: 72kB -> Seq Scan on groups (cost=0.00..32.60 rows=2260 width=8) (actual time=0.042..0.298 rows=1000 loops=3) Planning Time: 0.310 ms Execution Time: 303.294 ms EXPLAIN (ANALYZE, TIMING) SELECT group_id, BOOL_AND(finished) FROM tasks INNER JOIN groups USING(group_id) GROUP BY group_id, priority ORDER BY priority ASC LIMIT 10; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.70..1.35 rows=10 width=9) (actual time=1.146..10.606 rows=10 loops=1) -> GroupAggregate (cost=0.70..64932.27 rows=1000000 width=9) (actual time=1.144..10.599 rows=10 loops=1) Group Key: groups.priority, tasks.group_id -> Nested Loop (cost=0.70..47432.27 rows=1000000 width=9) (actual time=0.055..8.028 rows=10001 loops=1) -> Index Only Scan using groups_priority_index on groups (cost=0.28..43.27 rows=1000 width=8) (actual time=0.028..0.036 rows=11 loops=1) Heap Fetches: 0 -> Index Scan using tasks_group_id_index on tasks (cost=0.42..37.39 rows=1000 width=5) (actual time=0.017..0.433 rows=909 loops=11) Index Cond: (group_id = groups.group_id) Planning Time: 0.878 ms Execution Time: 10.672 ms All join orders for inner joins give semantically equivalent results, and documentation at https://www.postgresql.org/docs/current/explicit-joins.html indicates that the planner should explore all join orders unless there are too many tables, so I would have expected the join order here to be immaterial to the query plan chosen. Digging a bit deeper, this appears to be due to the USING clause causing the GROUP BY group_id to be rewritten to `groups.group_id` in the first query and `tasks.group_id` in the second query, and the former resulting in the simpler plan not being used. The same difference can be observed running: EXPLAIN (ANALYZE, TIMING) SELECT groups.group_id, BOOL_AND(finished) FROM tasks INNER JOIN groups ON tasks.group_id = groups.group_id GROUP BY groups.group_id, priority ORDER BY priority ASC LIMIT 10; (the slow plan) EXPLAIN (ANALYZE, TIMING) SELECT tasks.group_id, BOOL_AND(finished) FROM tasks INNER JOIN groups ON tasks.group_id = groups.group_id GROUP BY tasks.group_id, priority ORDER BY priority ASC LIMIT 10; (the fast plan)
pgsql-bugs by date: