Re: POC: GROUP BY optimization - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: POC: GROUP BY optimization
Date
Msg-id ba2eccf2-d623-c12c-f7c2-a58ac4917c83@enterprisedb.com
Whole thread Raw
In response to Re: POC: GROUP BY optimization  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: POC: GROUP BY optimization  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On 7/20/21 3:12 PM, Tomas Vondra wrote:
> ...
>
> The other comments from the review still apply - I'm particularly
> concerned about the (1) point, i.e. plan changes in postgres_fdw. Those
> seem to be rather strange (LIMIT not being pushed down in queries
> without any grouping). I'd bet this is due to changes in sort costing
> and does not seem very desirable.
> 

I did look at this a bit closer, and yes - this very much seems like a
costing issue in the patch. The first query in postgres_fdw that changes
switches from a query with LIMIT pushed to remote

                              QUERY PLAN
------------------------------------------------------------------------
 Foreign Scan  (cost=196.29..196.52 rows=10 width=14)
   Output: t1.c1, t2.c1, t1.c3
   Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
   Remote SQL: SELECT r1."C 1", r2."C 1", r1.c3 FROM ("S 1"."T 1" r1
               INNER JOIN "S 1"."T 1" r2 ON ... LIMIT ... OFFSET ...
(4 rows)

to the LIMIT executed locally

                              QUERY PLAN
-------------------------------------------------------------------------
 Limit  (cost=241.72..241.94 rows=10 width=14)
   Output: t1.c1, t2.c1, t1.c3
   ->  Foreign Scan  (cost=239.47..261.97 rows=1000 width=14)
         Output: t1.c1, t2.c1, t1.c3
         Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
         Remote SQL: SELECT r1."C 1", r2."C 1", r1.c3 FROM ("S 1"."T 1"
         r1 INNER JOIN ...
(6 rows)

The FDW code runs explain on both queries - with and without LIMIT
pushed to remote side, to get estimates, and without the patch it gets this:

                                  QUERY PLAN
------------------------------------------------------------------------------
 Sort  (cost=106.97..109.47 rows=1000 width=14)
   Sort Key: r1.c3, r1."C 1"
   ->  Hash Join  (cost=33.50..57.14 rows=1000 width=14)
         Hash Cond: (r1."C 1" = r2."C 1")
         ->  Seq Scan on "T 1" r1  (cost=0.00..21.00 rows=1000 width=10)
         ->  Hash  (cost=21.00..21.00 rows=1000 width=4)
               ->  Seq Scan on "T 1" r2  (cost=0.00..21.00 rows=1000
width=4)
(7 rows)

                                     QUERY PLAN
------------------------------------------------------------------------------------
 Limit  (cost=96.29..96.32 rows=10 width=14)
   ->  Sort  (cost=96.04..98.54 rows=1000 width=14)
         Sort Key: r1.c3, r1."C 1"
         ->  Hash Join  (cost=33.50..57.14 rows=1000 width=14)
               Hash Cond: (r1."C 1" = r2."C 1")
               ->  Seq Scan on "T 1" r1  (cost=0.00..21.00 rows=1000
width=10)
               ->  Hash  (cost=21.00..21.00 rows=1000 width=4)
                     ->  Seq Scan on "T 1" r2  (cost=0.00..21.00
rows=1000 width=4)
(8 rows)


while with the patch it gets this:

                                  QUERY PLAN
------------------------------------------------------------------------------
 Sort  (cost=139.47..141.97 rows=1000 width=14)
   Sort Key: r1.c3, r1."C 1"
   ->  Hash Join  (cost=33.50..57.14 rows=1000 width=14)
         Hash Cond: (r1."C 1" = r2."C 1")
         ->  Seq Scan on "T 1" r1  (cost=0.00..21.00 rows=1000 width=10)
         ->  Hash  (cost=21.00..21.00 rows=1000 width=4)
               ->  Seq Scan on "T 1" r2  (cost=0.00..21.00 rows=1000
width=4)
(7 rows)

                                     QUERY PLAN
------------------------------------------------------------------------------------
 Limit  (cost=145.75..145.77 rows=10 width=14)
   ->  Sort  (cost=145.50..148.00 rows=1000 width=14)
         Sort Key: r1.c3, r1."C 1"
         ->  Hash Join  (cost=33.50..57.14 rows=1000 width=14)
               Hash Cond: (r1."C 1" = r2."C 1")
               ->  Seq Scan on "T 1" r1  (cost=0.00..21.00 rows=1000
width=10)
               ->  Hash  (cost=21.00..21.00 rows=1000 width=4)
                     ->  Seq Scan on "T 1" r2  (cost=0.00..21.00
rows=1000 width=4)
(8 rows)


Notice that the costs get "inverted"

                       master            patched
   ----------------------------------------------
    no limit   106.97..109.47     139.47..141.97
    limit       96.29.. 96.32     145.75..145.77

so the limit looks a bit more expensive - just enough so that it seems
cheaper to transfer more rows and execute the limit locally.

IMO this looks rather suspicious (or wrong), and compute_cpu_sort_cost
may need some fixes. I wonder why differs from the old code so much ...


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Have I found an interval arithmetic bug?
Next
From: Jeff Janes
Date:
Subject: Bitmap reuse