Re: Using each rel as both outer and inner for JOIN_ANTI - Mailing list pgsql-hackers
From | Alvaro Herrera |
---|---|
Subject | Re: Using each rel as both outer and inner for JOIN_ANTI |
Date | |
Msg-id | 20220809105427.ym3ehoop7dnjame6@alvherre.pgsql Whole thread Raw |
In response to | Re: Using each rel as both outer and inner for JOIN_ANTI (Richard Guo <guofenglinux@gmail.com>) |
Responses |
Re: Using each rel as both outer and inner for JOIN_ANTI
|
List | pgsql-hackers |
Just for kicks, I ran query in your original post under EXPLAIN ANALYZE in both patched and unpatched with this last version. I got this (best of three): Unpatched: 55432 16devel 437532=# explain (analyze, buffers) select * from foo left join bar on foo.a = bar.c where bar.c is null; QUERY PLAN ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Hash Anti Join (cost=159039.00..183457.23 rows=10 width=20) (actual time=482.788..483.182 rows=10 loops=1) Hash Cond: (foo.a = bar.c) Buffers: shared hit=161 read=21964, temp read=8 written=8 -> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8) (actual time=0.020..0.022 rows=10 loops=1) Buffers: shared hit=1 -> Hash (cost=72124.00..72124.00 rows=5000000 width=12) (actual time=482.128..482.129 rows=0 loops=1) Buckets: 262144 Batches: 64 Memory Usage: 2048kB Buffers: shared hit=160 read=21964 -> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=12) (actual time=0.092..237.431 rows=5000000 loops=1) Buffers: shared hit=160 read=21964 Planning Time: 0.182 ms Execution Time: 483.248 ms Patched: QUERY PLAN ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Hash Right Anti Join (cost=1.23..90875.24 rows=10 width=20) (actual time=457.654..457.658 rows=10 loops=1) Hash Cond: (bar.c = foo.a) Buffers: shared hit=33 read=22092 -> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=12) (actual time=0.020..229.097 rows=5000000 loops=1) Buffers: shared hit=32 read=22092 -> Hash (cost=1.10..1.10 rows=10 width=8) (actual time=0.011..0.012 rows=10 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB Buffers: shared hit=1 -> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8) (actual time=0.006..0.007 rows=10 loops=1) Buffers: shared hit=1 Planning Time: 0.067 ms Execution Time: 457.679 ms I suppose this looks good as far as the plan goes, but the cost estimation might be a little bit too optimistic: it is reporting that the new plan costs 50% of the original, yet the execution time is only 5% lower. I wonder where does time go (in unpatched) when seqscanning finishes and before hashing starts. (I had to disable JIT for the first one, as it insisted on JITting tuple deforming.) -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'. After collecting 500 such letters, he mused, a university somewhere in Arizona would probably grant him a degree. (Don Knuth)
pgsql-hackers by date: