
From: Nikita Malakhov
Subject: Considering fractional paths in Append node
Msg-id: CAN-LCVPxnWB39CUBTgOQ9O7Dd8DrA_tpT1EY3LNVnUuvAX1NjA@mail.gmail.com
List: pgsql-hackers
Hi hackers!

A colleague of mine, Andrei Lepikhov, has found an interesting behavior
in path cost calculation for the Append node: when evaluating the cheapest
path, it does not take fractional path costs into account.

We've prepared a patch that makes the add_paths_to_append_rel function
consider non-parameterized fractional path costs as well.
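
Just to illustrate the idea (a rough sketch only, not the actual patch), the
change boils down to something like the following inside
add_paths_to_append_rel() in allpaths.c, reusing the existing
get_cheapest_fractional_path() helper; rel and live_childrels are the
function's own arguments:

/*
 * Sketch only: when the query needs just a fraction of the rows
 * (root->tuple_fraction > 0, e.g. because of LIMIT), also collect each
 * child's cheapest unparameterized fractional path and offer one more
 * Append path built from them, letting add_path() keep it only if it
 * actually wins.
 */
if (root->tuple_fraction > 0.0)
{
	List	   *fractional_subpaths = NIL;
	ListCell   *lc;

	foreach(lc, live_childrels)
	{
		RelOptInfo *childrel = (RelOptInfo *) lfirst(lc);
		Path	   *frac_path;

		frac_path = get_cheapest_fractional_path(childrel,
												 root->tuple_fraction);

		/* consider only non-parameterized child paths here */
		if (frac_path == NULL || PATH_REQ_OUTER(frac_path) != NULL)
		{
			fractional_subpaths = NIL;
			break;
		}

		fractional_subpaths = lappend(fractional_subpaths, frac_path);
	}

	if (fractional_subpaths != NIL)
		add_path(rel, (Path *) create_append_path(root, rel,
												  fractional_subpaths,
												  NIL, NIL, NULL,
												  0, false, -1));
}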

The effect is easily seen in one of the standard PG regression tests:
Vanilla (current master):
explain analyze
select t1.unique1 from tenk1 t1
inner join tenk2 t2 on t1.tenthous = t2.tenthous and t2.thousand = 0
   union all
(values(1)) limit 1;
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..219.55 rows=1 width=4) (actual time=6.309..6.312 rows=1 loops=1)
   ->  Append  (cost=0.00..2415.09 rows=11 width=4) (actual time=6.308..6.310 rows=1 loops=1)
         ->  Nested Loop  (cost=0.00..2415.03 rows=10 width=4) (actual time=6.307..6.308 rows=1 loops=1)
               Join Filter: (t1.tenthous = t2.tenthous)
               Rows Removed by Join Filter: 4210
               ->  Seq Scan on tenk1 t1  (cost=0.00..445.00 rows=10000 width=8) (actual time=0.004..0.057 rows=422 loops=1)
               ->  Materialize  (cost=0.00..470.05 rows=10 width=4) (actual time=0.000..0.014 rows=10 loops=422)
                     Storage: Memory  Maximum Storage: 17kB
                     ->  Seq Scan on tenk2 t2  (cost=0.00..470.00 rows=10 width=4) (actual time=0.076..5.535 rows=10 loops=1)
                           Filter: (thousand = 0)
                           Rows Removed by Filter: 9990
         ->  Result  (cost=0.00..0.01 rows=1 width=4) (never executed)
 Planning Time: 0.324 ms
 Execution Time: 6.336 ms
(14 rows)

Patched, the same test:
explain analyze
select t1.unique1 from tenk1 t1
inner join tenk2 t2 on t1.tenthous = t2.tenthous and t2.thousand = 0
   union all
(values(1)) limit 1;
                                                                    QUERY PLAN                                                                    
--------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..126.00 rows=1 width=4) (actual time=0.105..0.106 rows=1 loops=1)
   ->  Append  (cost=0.29..1383.12 rows=11 width=4) (actual time=0.104..0.105 rows=1 loops=1)
         ->  Nested Loop  (cost=0.29..1383.05 rows=10 width=4) (actual time=0.104..0.104 rows=1 loops=1)
               ->  Seq Scan on tenk2 t2  (cost=0.00..470.00 rows=10 width=4) (actual time=0.076..0.076 rows=1 loops=1)
                     Filter: (thousand = 0)
                     Rows Removed by Filter: 421
               ->  Index Scan using tenk1_thous_tenthous on tenk1 t1  (cost=0.29..91.30 rows=1 width=8) (actual time=0.026..0.026 rows=1 loops=1)
                     Index Cond: (tenthous = t2.tenthous)
         ->  Result  (cost=0.00..0.01 rows=1 width=4) (never executed)
 Planning Time: 0.334 ms
 Execution Time: 0.130 ms
(11 rows)
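
For reference on the cost numbers: with LIMIT 1 over the 11-row Append, the
Limit node is costed at roughly the Append's startup cost plus 1/11 of its run
cost, i.e. about 2415.09 / 11 ≈ 219.55 on master versus
0.29 + (1383.12 - 0.29) / 11 ≈ 126.00 with the patch. Once the fractional
paths are considered, the planner prefers the index-driven plan and the actual
execution time drops from ~6.3 ms to ~0.13 ms.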

We hope this optimization will be useful.

--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
