Re: Optimizer questions - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Optimizer questions
Date
Msg-id CAPpHfdtUe1Ow_nMyv5PaiO7K9o5T+-KYg5cMtQZQrQ8zuseOaA@mail.gmail.com
Whole thread Raw
In response to Re: Optimizer questions  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses Re: Optimizer questions  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List pgsql-hackers
On Fri, Jan 8, 2016 at 11:58 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
Attached please find improved version of the optimizer patch for LIMIT clause.
Now I try to apply this optimization only for non-trivial columns requiring evaluation.
May be it will be better to take in account cost of this columns evaluation but right now I just detect non-variable columns.

We may think about more general feature: LIMIT pushdown. In the Konstantin's patch planner push LIMIT before tlist calculation.
But there are other cases when calculating LIMIT first would be beneficial. For instance, we can do LIMIT before JOINs. That is possible only for LEFT JOIN which is not used in filter and ordering clauses. See the example below.

# create table t1 as (select i, random() v from generate_series(1,1000000) i);
SELECT 1000000

# create table t2 as (select i, random() v from generate_series(1,1000000) i);
SELECT 1000000

# explain analyze select * from t1 left join t2 on t1.i = t2.i order by t1.v limit 10;
                                                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=87421.64..87421.67 rows=10 width=24) (actual time=1486.276..1486.278 rows=10 loops=1)
   ->  Sort  (cost=87421.64..89921.64 rows=1000000 width=24) (actual time=1486.275..1486.275 rows=10 loops=1)
         Sort Key: t1.v
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Hash Left Join  (cost=27906.00..65812.00 rows=1000000 width=24) (actual time=226.180..1366.238 rows=1000000
               Hash Cond: (t1.i = t2.i)
               ->  Seq Scan on t1  (cost=0.00..15406.00 rows=1000000 width=12) (actual time=0.010..77.041 rows=1000000 l
               ->  Hash  (cost=15406.00..15406.00 rows=1000000 width=12) (actual time=226.066..226.066 rows=1000000 loop
                     Buckets: 131072  Batches: 1  Memory Usage: 46875kB
                     ->  Seq Scan on t2  (cost=0.00..15406.00 rows=1000000 width=12) (actual time=0.007..89.002 rows=100
 Planning time: 0.123 ms
 Execution time: 1492.118 ms
(12 rows)

# explain analyze select * from (select * from t1 order by t1.v limit 10) t1 left join t2 on t1.i = t2.i;
                                                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Hash Right Join  (cost=37015.89..56171.99 rows=10 width=24) (actual time=198.478..301.278 rows=10 loops=1)
   Hash Cond: (t2.i = t1.i)
   ->  Seq Scan on t2  (cost=0.00..15406.00 rows=1000000 width=12) (actual time=0.005..74.303 rows=1000000 loops=1)
   ->  Hash  (cost=37015.77..37015.77 rows=10 width=12) (actual time=153.584..153.584 rows=10 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 1kB
         ->  Limit  (cost=37015.64..37015.67 rows=10 width=12) (actual time=153.579..153.580 rows=10 loops=1)
               ->  Sort  (cost=37015.64..39515.64 rows=1000000 width=12) (actual time=153.578..153.578 rows=10 loops=1)
                     Sort Key: t1.v
                     Sort Method: top-N heapsort  Memory: 25kB
                     ->  Seq Scan on t1  (cost=0.00..15406.00 rows=1000000 width=12) (actual time=0.012..78.828 rows=100
 Planning time: 0.132 ms
 Execution time: 301.308 ms
(12 rows)

In this example LIMIT pushdown makes query 5 times faster. It would be very nice if optimizer make this automatically.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

pgsql-hackers by date:

Previous
From: Merlin Moncure
Date:
Subject: Re: Move PinBuffer and UnpinBuffer to atomics
Next
From: Tom Lane
Date:
Subject: Re: Sequence Access Method WIP