optimizer picks smaller table to drive nested loops? - Mailing list pgsql-performance

From Greg Stark
Subject optimizer picks smaller table to drive nested loops?
Date
Msg-id 873chi2o53.fsf@stark.dyndns.tv
Whole thread Raw
Responses Re: optimizer picks smaller table to drive nested loops?  (Randy Neumann <Randy_Neumann@centralref.com>)
Re: optimizer picks smaller table to drive nested loops?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Someone asked a hypothetical question about how to retrieve all records of a
table twice in SQL. It got me thinking about whether there was a way to do
this efficiently.

"Obviously" if you do it using the UNION ALL approach postgres isn't going to
do two separate scans, doing it otherwise would be quite hard.

However using the join approach it seems postgres ought to be able to do a
single sequential scan and return every tuple it finds twice. It doesn't do
this:

slo=> explain analyze select * from region, (select 1 union all select 2) as x;
                                                       QUERY PLAN


------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..11162.00 rows=5534 width=108) (actual time=0.13..541.19 rows=5534 loops=1)
   ->  Subquery Scan x  (cost=0.00..2.00 rows=2 width=0) (actual time=0.03..0.08 rows=2 loops=1)
         ->  Append  (cost=0.00..2.00 rows=2 width=0) (actual time=0.02..0.05 rows=2 loops=1)
               ->  Subquery Scan "*SELECT* 1"  (cost=0.00..1.00 rows=1 width=0) (actual time=0.01..0.02 rows=1 loops=1)
                     ->  Result  (cost=0.00..1.00 rows=1 width=0) (actual time=0.01..0.01 rows=1 loops=1)
               ->  Subquery Scan "*SELECT* 2"  (cost=0.00..1.00 rows=1 width=0) (actual time=0.01..0.02 rows=1 loops=1)
                     ->  Result  (cost=0.00..1.00 rows=1 width=0) (actual time=0.01..0.01 rows=1 loops=1)
   ->  Seq Scan on region  (cost=0.00..2813.00 rows=2767 width=104) (actual time=0.03..123.44 rows=2767 loops=2)
 Total runtime: 566.24 msec
(9 rows)

Wouldn't it be faster to drive the nested loop the other way around?

(I'm also a bit puzzled why the optimizer is calculating that 2,813 * 2 = 5,534)

This is tested on 7.3. I haven't tried CVS yet.

--
greg

pgsql-performance by date:

Previous
From: Richard Huxton
Date:
Subject: Re: Optimizer differences between 7.2 and 7.3
Next
From: "scott.marlowe"
Date:
Subject: Re: Extreme high load averages