Re: Performance impact of NULLs and variable length fields - Mailing list pgsql-general

From Martijn van Oosterhout
Subject Re: Performance impact of NULLs and variable length fields
Date
Msg-id 20010722123750.A16356@svana.org
Whole thread Raw
In response to Performance impact of NULLs and variable length fields  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
Well, Tom pointed out that without the query itself this was fairly
meaningless. So here it is:

select c.id, sum(d.cost)
from c, l, d
where c.id = l.cid
and l.clid = d.clid
and c.accman = '2500'
and billid is null
group by c.id;

EXPLAIN output:
Aggregate  (cost=51479.55..51507.07 rows=550 width=48) (actual time=26192.07..26222.07 rows=156 loops=1)
  ->  Group  (cost=51479.55..51493.31 rows=5503 width=48) (actual time=26165.81..26186.14 rows=2029 loops=156)
        ->  Sort  (cost=51479.55..51479.55 rows=5503 width=48) (actual time=26164.90..26167.77 rows=2029 loops=1)
              ->  Hash Join  (cost=676.79..51137.62 rows=5503 width=48) (actual time=3703.82..26122.23 rows=2029
loops=1)
                    ->  Seq Scan on d  (cost=0.00..50150.07 rows=32262 width=12) (actual time=67.35..24921.42
rows=41045loops=1) 
                    ->  Hash  (cost=671.31..671.31 rows=2193 width=36) (actual time=594.55..594.55 rows=0 loops=1)
                          ->  Merge Join  (cost=260.50..671.31 rows=2193 width=36) (actual time=251.95..551.40
rows=1330loops=1) 
                                ->  Index Scan using l_cid on l  (cost=0.00..364.36 rows=6308 width=20) (actual
time=33.95..340.52rows=6307 loops=1) 
                                ->  Sort  (cost=260.50..260.50 rows=1305 width=16) (actual time=142.85..145.34
rows=1337loops=1) 
                                      ->  Seq Scan on c  (cost=0.00..192.94 rows=1305 width=16) (actual
time=14.99..103.40rows=1314 loops=1) 

d has over 1.4 million rows, l about 10,000 and c about 2,000.

NULL columns and variable length fields used extensivly.

On Sat, Jul 21, 2001 at 05:25:38PM +1000, Martijn van Oosterhout wrote:
> Sometime ago somebody asked if it made a difference adn it was suggested
> that the effect was probably marginal. I ran a profiler over postgres doing
> a large query and these are the top 10 functions:
>
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ms/call  ms/call  name
>  16.04      0.51     0.51  1676772     0.00     0.00  nocachegetattr
>  11.95      0.89     0.38  1427403     0.00     0.00  heapgettup
>  10.06      1.21     0.32  2955372     0.00     0.00  LockBuffer
>   6.92      1.43     0.22  3406475     0.00     0.00  ExecEvalExpr
>   5.03      1.59     0.16  1617018     0.00     0.00  AllocSetReset
>   4.40      1.73     0.14  1427403     0.00     0.00  heap_getnext
>   4.09      1.86     0.13  1844339     0.00     0.00  ExecEvalVar
>   4.09      1.99     0.13  1441330     0.00     0.00  ReleaseBuffer
>   4.09      2.12     0.13  1427064     0.00     0.00  SeqNext
>   3.46      2.23     0.11  1552338     0.00     0.00  ExecQual

--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: COPY failure
Next
From: Joshua Jore
Date:
Subject: Re: Microsoft SQL Server Replication