Re: impact join syntax ?? and gist index ?? - Mailing list pgsql-general

From Tom Lane
Subject Re: impact join syntax ?? and gist index ??
Date
Msg-id 3481346.1673127177@sss.pgh.pa.us
Whole thread Raw
In response to Re: impact join syntax ?? and gist index ??  (Marc Millas <marc.millas@mokadb.com>)
Responses Re: impact join syntax ?? and gist index ??
List pgsql-general
Marc Millas <marc.millas@mokadb.com> writes:
> 2023=# explain (analyze, buffers) select A.x, count(B.x) from A left join B
> on st_within(B.geom, A.geom) group by A.x;

So the problem with this is that the only decently-performant way to
do the join is like

>                      ->  Nested Loop  (cost=0.13..6275745.36 rows=4971999 width=16) (actual time=0.505..11781.817
rows=210651loops=2) 
>                            ->  Parallel Seq Scan on B  (cost=0.00..18454.99 rows=248699 width=40) (actual
time=0.005..22.859rows=211395 loops=2) 
>                            ->  Index Scan using A_geom_idx on A (cost=0.13..25.15 rows=1 width=17752) (actual
time=0.054..0.055rows=1 loops=422789) 
>                                  Index Cond: (geom ~ B.geom)
>                                  Filter: st_within(B.geom, geom)

(Ignore the parallelism, it's not very relevant here.)  There's no
chance for merge or hash join because those require simple equality
join conditions.  The only way to avoid a stupid
compare-every-row-of-A-to-every-row-of-B nestloop is to use a
parameterized inner indexscan, as this plan does.  But that only works
if the join is inner or has the indexed table on the nullable side.
We have no support for nestloop right join, which is what would be
needed to make things run fast with no index on B.

            regards, tom lane



pgsql-general by date:

Previous
From: Marc Millas
Date:
Subject: Re: impact join syntax ?? and gist index ??
Next
From: Marc Millas
Date:
Subject: Re: impact join syntax ?? and gist index ??