Re: Using each rel as both outer and inner for JOIN_ANTI - Mailing list pgsql-hackers
| From | Alvaro Herrera |
|---|---|
| Subject | Re: Using each rel as both outer and inner for JOIN_ANTI |
| Date | |
| Msg-id | 20220809105427.ym3ehoop7dnjame6@alvherre.pgsql Whole thread Raw |
| In response to | Re: Using each rel as both outer and inner for JOIN_ANTI (Richard Guo <guofenglinux@gmail.com>) |
| Responses |
Re: Using each rel as both outer and inner for JOIN_ANTI
|
| List | pgsql-hackers |
Just for kicks, I ran query in your original post under EXPLAIN ANALYZE
in both patched and unpatched with this last version. I got this (best
of three):
Unpatched:
55432 16devel 437532=# explain (analyze, buffers) select * from foo left join bar on foo.a = bar.c where bar.c is
null;
QUERY PLAN
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Hash Anti Join (cost=159039.00..183457.23 rows=10 width=20) (actual time=482.788..483.182 rows=10 loops=1)
Hash Cond: (foo.a = bar.c)
Buffers: shared hit=161 read=21964, temp read=8 written=8
-> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8) (actual time=0.020..0.022 rows=10 loops=1)
Buffers: shared hit=1
-> Hash (cost=72124.00..72124.00 rows=5000000 width=12) (actual time=482.128..482.129 rows=0 loops=1)
Buckets: 262144 Batches: 64 Memory Usage: 2048kB
Buffers: shared hit=160 read=21964
-> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=12) (actual time=0.092..237.431 rows=5000000
loops=1)
Buffers: shared hit=160 read=21964
Planning Time: 0.182 ms
Execution Time: 483.248 ms
Patched:
QUERY PLAN
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Hash Right Anti Join (cost=1.23..90875.24 rows=10 width=20) (actual time=457.654..457.658 rows=10 loops=1)
Hash Cond: (bar.c = foo.a)
Buffers: shared hit=33 read=22092
-> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=12) (actual time=0.020..229.097 rows=5000000 loops=1)
Buffers: shared hit=32 read=22092
-> Hash (cost=1.10..1.10 rows=10 width=8) (actual time=0.011..0.012 rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8) (actual time=0.006..0.007 rows=10 loops=1)
Buffers: shared hit=1
Planning Time: 0.067 ms
Execution Time: 457.679 ms
I suppose this looks good as far as the plan goes, but the cost estimation
might be a little bit too optimistic: it is reporting that the new plan
costs 50% of the original, yet the execution time is only 5% lower.
I wonder where does time go (in unpatched) when seqscanning finishes
and before hashing starts.
(I had to disable JIT for the first one, as it insisted on JITting tuple
deforming.)
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green
stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'.
After collecting 500 such letters, he mused, a university somewhere in
Arizona would probably grant him a degree. (Don Knuth)
pgsql-hackers by date: