Difference in queryplan for array-contains vs unnest - Mailing list pgsql-bugs

From Paul van der Linden
Subject Difference in queryplan for array-contains vs unnest
Date
Msg-id AM0PR0402MB3425518B3D854590B73AF19BECC30@AM0PR0402MB3425.eurprd04.prod.outlook.com
Whole thread Raw
List pgsql-bugs

This is the version for unnest:

SELECT *

FROM foo

LEFT JOIN LATERAL (

  SELECT 1 AS present

  FROM selectionpolies

  WHERE st_intersects(selectionpolies.geo, foo.geo) AND

        selectionpolies.text in (SELECT unnest(countries.countrycodes) AS unnest FROM countries)

  LIMIT 1

) bar ON true

WHERE bar.present IS NULL AND geo && 'BOX(-1.87756 47.05916,-1.60932 47.18086)'::box2d

 

Nested Loop Left Join  (cost=29.87..11662.66 rows=9 width=88)

  Output: foo.id, foo.geo, (1)

  Filter: ((1) IS NULL)

  ->  Bitmap Heap Scan on foo  (cost=29.73..1982.22 rows=1796 width=84)

        Output: foo.id, foo.geo

        Recheck Cond: (foo.geo && '01030000000100000005000000B471C45A7C0AFEBF9D9D0C8E92874740B471C45A7C0AFEBFC993A46B269747408FC70C54C6BFF9BFC993A46B269747408FC70C54C6BFF9BF9D9D0C8E92874740B471C45A7C0AFEBF9D9D0C8E92874740'::geometry)

        ->  Bitmap Index Scan on foo_gidx  (cost=0.00..29.29 rows=1796 width=0)

              Index Cond: (foo.geo && '01030000000100000005000000B471C45A7C0AFEBF9D9D0C8E92874740B471C45A7C0AFEBFC993A46B269747408FC70C54C6BFF9BFC993A46B269747408FC70C54C6BFF9BF9D9D0C8E92874740B471C45A7C0AFEBF9D9D0C8E92874740'::geometry)

  ->  Limit  (cost=0.14..5.37 rows=1 width=4)

        Output: 1

        ->  Nested Loop Semi Join  (cost=0.14..5.37 rows=1 width=4)

              Output: 1

              Join Filter: (selectionpolies.text = (unnest('{nl}'::text[])))

              ->  Index Scan using selectionpolies_gidx on public.selectionpolies  (cost=0.14..2.60 rows=1 width=8)

                    Output: selectionpolies.text, selectionpolies.geo

                    Index Cond: (selectionpolies.geo && foo.geo)

                    Filter: _st_intersects(selectionpolies.geo, foo.geo)

              ->  ProjectSet  (cost=0.00..0.52 rows=100 width=32)

                    Output: unnest('{nl}'::text[])

                    ->  Result  (cost=0.00..0.01 rows=1 width=0)

 

For array contains:

SELECT *

FROM foo

LEFT JOIN LATERAL (

  SELECT 1 AS present

  FROM selectionpolies

  WHERE st_intersects(selectionpolies.geo, foo.geo) AND

       array[selectionpolies.text]<@( SELECT countries.countrycodes FROM countries)

  LIMIT 1

) bar ON true

WHERE bar.present IS NULL AND geo && 'BOX(-1.87756 47.05916,-1.60932 47.18086)'::box2d

 

Nested Loop Left Join  (cost=29.88..6714.68 rows=9 width=88)

  Output: foo.id, foo.geo, (1)

  Filter: ((1) IS NULL)

  ->  Bitmap Heap Scan on foo  (cost=29.73..1982.22 rows=1796 width=84)

        Output: foo.id, foo.geo

        Recheck Cond: (foo.geo && '01030000000100000005000000B471C45A7C0AFEBF9D9D0C8E92874740B471C45A7C0AFEBFC993A46B269747408FC70C54C6BFF9BFC993A46B269747408FC70C54C6BFF9BF9D9D0C8E92874740B471C45A7C0AFEBF9D9D0C8E92874740'::geometry)

        ->  Bitmap Index Scan on foo_gidx  (cost=0.00..29.29 rows=1796 width=0)

              Index Cond: (foo.geo && '01030000000100000005000000B471C45A7C0AFEBF9D9D0C8E92874740B471C45A7C0AFEBFC993A46B269747408FC70C54C6BFF9BFC993A46B269747408FC70C54C6BFF9BF9D9D0C8E92874740B471C45A7C0AFEBF9D9D0C8E92874740'::geometry)

  ->  Limit  (cost=0.15..2.62 rows=1 width=4)

        Output: 1

        InitPlan 1 (returns $0)

          ->  Result  (cost=0.00..0.01 rows=1 width=32)

                Output: '{nl}'::text[]

        ->  Index Scan using selectionpolies_gidx on public.selectionpolies  (cost=0.14..2.61 rows=1 width=4)

              Output: 1

              Index Cond: (selectionpolies.geo && foo.geo)

              Filter: ((ARRAY[selectionpolies.text] <@ $0) AND _st_intersects(selectionpolies.geo, foo.geo))

 

So in the array-contains case, the selectionpolies are filtered on the text column, while in the unnest case that filtering is postponed to the nested loop semi join (and thus calculating the st_intersects on too much (large!) polygons) despite the planner knowing it’s only one record.

 

Obviously the workaround will be to use the array-contains version, but I thought I mention it anyway because this could be a serious performance dip

 

Paul van der Linden | MapCreator

Bogert 31-06, 5612 LX, Eindhoven, The Netherlands

Main Office: +31 40 264 5120

MapCreator.eu |  cid:image001.png@01D2FCB9.37565600  cid:image002.png@01D2FCB9.37565600  cid:image003.png@01D2FCB9.37565600 


cid:image004.png@01D2FCB9.37565600
   
    cid:image005.png@01D2FCB9.37565600  cid:image006.png@01D2FCB9.37565600

 

Attachment

pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #15506: Foreign data wrapper (postgres_fdw) unexpected behavior
Next
From: "Purushotham"
Date:
Subject: Am unable to return after update/insert execute, let me know is it the way to create or else any other way?