Thread: Nested loop does not preserve order. Why?
Hello! Let us have a table T(A, B), an index on T(B, A) and a much smaller table V(B). stat-dev1/ui_dev_16 12:27:07 > create table u as select a, a % 1000 b from generate_series(1, 10000000) a; SELECT 10000000 stat-dev1/ui_dev_16 12:27:28 > create table v as select distinct b from u where b % 100 = 1; SELECT 10 stat-dev1/ui_dev_16 12:27:41 > analyze t; ANALYZE stat-dev1/ui_dev_16 12:27:42 > analyze v; ANALYZE stat-dev1/ui_dev_16 12:27:45 > create index u_i on u(b,a); CREATE INDEX And we are looking for all rows of T that have B present in V and sorted by B, A. The most natural way to obtain them would be to take all records from V ordered by B, and to perform an index-only scan on T for each B. But not for postgresql :( stat-dev1/ui_dev_16 12:28:57 > EXPLAIN ANALYZE select b, a from u where b in (select b from v) order by b, a; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- Sort (cost=879810.77..892310.77 rows=5000000 width=8) (actual time=4728.831..4756.802 rows=100000 loops=1) Sort Key: u.b, u.a Sort Method: external merge Disk: 1760kB -> Hash Join (cost=1.35..186749.35 rows=5000000 width=8) (actual time=0.121..4568.170 rows=100000 loops=1) Hash Cond: (u.b = v.b) -> Seq Scan on u (cost=0.00..144248.00 rows=10000000 width=8) (actual time=0.027..2294.474 rows=10000000 loops=1) -> Hash (cost=1.23..1.23 rows=10 width=4) (actual time=0.071..0.071 rows=10 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> HashAggregate (cost=1.12..1.23 rows=10 width=4) (actual time=0.062..0.063 rows=10 loops=1) -> Seq Scan on v (cost=0.00..1.10 rows=10 width=4) (actual time=0.050..0.052 rows=10 loops=1) Total runtime: 4771.817 ms (11 rows) Replacing IN by EXISTS changes nothing. Putting an ORDER BY clause into IN(...) unsurprisingly does not help. The only way to obtain the desired plan I found is to rewrite the query: stat-dev1/ui_dev_16 12:29:06 > set enable_bitmapscan to off; SET stat-dev1/ui_dev_16 12:29:11 > set enable_seqscan to off; SET stat-dev1/ui_dev_16 12:29:13 > EXPLAIN ANALYZE select b, a from u where b = any((select array_agg(b) from v)::int[]) order by b, a; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------- Index Only Scan using u_i on u (cost=10000000001.58..10000315253.45 rows=99551 width=8) (actual time=0.078..103.239 rows=100000 loops=1) Index Cond: (b = ANY ($0)) Heap Fetches: 100000 InitPlan 1 (returns $0) -> Aggregate (cost=10000000001.13..10000000001.14 rows=1 width=4) (actual time=0.018..0.018 rows=1 loops=1) -> Seq Scan on v (cost=10000000000.00..10000000001.10 rows=10 width=4) (actual time=0.006..0.009 rows=10 loops=1) Total runtime: 119.929 ms (7 rows) The plan I expect for the original query is something like that: Nested Loop Preserving Order (cost=1.85..317895.69 rows=100000 width=8) -> HashAggregate (cost=1.42..1.52 rows=10 width=4) -> Sort (cost=1.27..1.29 rows=10 width=4) Sort Key: v.b -> Seq Scan on v (cost=0.00..1.10 rows=10 width=4) -> Index Only Scan using u_i on u (cost=0.43..31689.42 rows=10000 width=8) Index Cond: (b = v.b) But if we try to hint postgresql we will realize that nested loop does not preserve order (see below). Why? stat-dev1/ui_dev_16 12:42:32 > set enable_bitmapscan to off; SET Time: 0,895 ms stat-dev1/ui_dev_16 12:42:33 > set enable_hashjoin to off; SET Time: 0,762 ms stat-dev1/ui_dev_16 12:42:36 > EXPLAIN select b, a from u where b in (select b from v order by b) order by b, a; QUERY PLAN -------------------------------------------------------------------------------------- Sort (cost=326200.51..326450.51 rows=100000 width=8) Sort Key: u.b, u.a -> Nested Loop (cost=1.85..317895.69 rows=100000 width=8) -> HashAggregate (cost=1.42..1.52 rows=10 width=4) -> Sort (cost=1.27..1.29 rows=10 width=4) Sort Key: v.b -> Seq Scan on v (cost=0.00..1.10 rows=10 width=4) -> Index Only Scan using u_i on u (cost=0.43..31689.42 rows=10000 width=8) Index Cond: (b = v.b) (9 rows)
Alexey Bashtanov <bashtanov@imap.cc> writes: > Let us have a table T(A, B), an index on T(B, A) and a much smaller > table V(B). > And we are looking for all rows of T that have B present in V and sorted > by B, A. > The most natural way to obtain them would be to take all records from V > ordered by B, and to perform an index-only scan on T for each B. > But not for postgresql :( AFAICS, what you're wishing is that given a plan like nestloop Join condition: x = y -> path ordered by x -> path ordered by y, z the planner would realize that the output of the nestloop is ordered by y,z. Right now it only realizes that the output is ordered by x (and hence also by y); this means it thinks it needs an additional sort step. As stated, though, this deduction about ordering is false. To make it true, you need the additional assumption that the outer relation is unique in x (plus assumptions about the equality operators in question all being compatible with the sort ordering). Right now, the planner doesn't really track uniqueness so there's no practical way to do this. There's a patch in the current commitfest that wants to add tracking of unique sort orders, so if that gets in we might have appropriate infrastructure. But even then, this seems like a mighty narrow and unusual corner case; I can't recall ever seeing a request for such behavior before. So I doubt we'd do it unless the deduction turns out to be basically free, which seems unlikely. Having said that, there does seem to be something fishy going on here. If I add an index on v.b to your example, and drop the explicit request to sort on a, I can get this: EXPLAIN select b, a from u where b in (select b from v) order by b; QUERY PLAN ----------------------------------------------------------------------------------------------- Sort (cost=13802.17..14052.17 rows=100000 width=8) Sort Key: u.b -> Nested Loop (cost=8.89..4128.85 rows=100000 width=8) -> Unique (cost=8.45..8.50 rows=10 width=4) -> Sort (cost=8.45..8.48 rows=10 width=4) Sort Key: v.b -> Index Only Scan using v_b_idx on v (cost=0.14..8.29 rows=10 width=4) -> Index Only Scan using u_i on u (cost=0.43..312.04 rows=10000 width=8) Index Cond: (b = v.b) Planning time: 0.481 ms (10 rows) I'd have expected the planner to realize that neither of those sort steps are necessary. Will investigate. regards, tom lane