Performance impact of NULLs and variable length fields - Mailing list pgsql-general

From Martijn van Oosterhout
Subject Performance impact of NULLs and variable length fields
Date
Msg-id 20010721172538.A8911@svana.org
Whole thread Raw
Responses Re: Performance impact of NULLs and variable length fields  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Performance impact of NULLs and variable length fields  (Martijn van Oosterhout <kleptog@svana.org>)
Re: Performance impact of NULLs and variable length fields  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-general
Sometime ago somebody asked if it made a difference adn it was suggested
that the effect was probably marginal. I ran a profiler over postgres doing
a large query and these are the top 10 functions:

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 16.04      0.51     0.51  1676772     0.00     0.00  nocachegetattr
 11.95      0.89     0.38  1427403     0.00     0.00  heapgettup
 10.06      1.21     0.32  2955372     0.00     0.00  LockBuffer
  6.92      1.43     0.22  3406475     0.00     0.00  ExecEvalExpr
  5.03      1.59     0.16  1617018     0.00     0.00  AllocSetReset
  4.40      1.73     0.14  1427403     0.00     0.00  heap_getnext
  4.09      1.86     0.13  1844339     0.00     0.00  ExecEvalVar
  4.09      1.99     0.13  1441330     0.00     0.00  ReleaseBuffer
  4.09      2.12     0.13  1427064     0.00     0.00  SeqNext
  3.46      2.23     0.11  1552338     0.00     0.00  ExecQual

Note: the wall clock time of the query was about 30 seconds. The total cpu
time was 3.18 seconds.

Now, nocachegetattr is called mostly whenever there are NULLs or variable
length strings in the tuple. Since our biggest table begins with a variable
length field, every single lookup is going to be uncached. Looks like the
effect is significant.

Is this ever going to change?

Just for interests sake i've also got the results for a plain select sum(x)
from y;

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  9.12      0.60     0.60  1423541     0.00     0.00  heapgettup
  8.51      1.16     0.56  1423329     0.00     0.00  nocachegetattr
  6.08      1.56     0.40  2850372     0.00     0.00  AllocSetAlloc
  5.47      1.92     0.36  1423308     0.00     0.00  ExecTargetList
  5.02      2.25     0.33  2846619     0.00     0.00  AllocSetReset
  4.10      2.52     0.27  1423308     0.00     0.00  heap_formtuple
  3.80      2.77     0.25  2846614     0.00     0.00  ExecEvalVar
  3.80      3.02     0.25        2   125.00   922.33  ExecAgg
  3.65      3.26     0.24  1423310     0.00     0.00  ExecProcNode
  3.04      3.46     0.20  1423307     0.00     0.00  advance_transition_function

Seems to me that the generality of the aggregate implementation doesn't seem
to really be an issue.

--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.

pgsql-general by date:

Previous
From: x
Date:
Subject: COPY failure
Next
From: "Richard Huxton"
Date:
Subject: Re: Microsoft SQL Server Replication