Re: Large CASE-statement is pretty slow? - Mailing list pgsql-performance

From Tom Lane
Subject Re: Large CASE-statement is pretty slow?
Date
Msg-id 3223.1079382871@sss.pgh.pa.us
Whole thread Raw
In response to Re: Large CASE-statement is pretty slow?  (Arjen van der Meijden <acmmailing@vulcanus.its.tudelft.nl>)
List pgsql-performance
Arjen van der Meijden <acmmailing@vulcanus.its.tudelft.nl> writes:
> [ huge CASE is pretty slow ]

I did some profiling of the test case that Arjen was kind enough to send
me.  It seems there are two distinct problems.  One is that the parser
uses repeated lappend()'s to construct the list of CASE arms; this
makes building the structure O(N^2) in the number of arms.  (If you
simply try to EXPLAIN the query, you find out that the parse time is
about a third of the run time :-( ... and 90% of that is spent inside
nconc() which is the guts of lappend.)  This problem is slated to be
fixed by Neil Conway's upcoming rewrite of the list support, which will
convert lappend into a constant-time operation.

The other difficulty is that the evaluation machinery for arithmetic
expressions has a lot of overhead.  The profile run shows:

  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 38.15     41.92    41.92   229646     0.00     0.00  nconc
 21.76     65.84    23.92 199054454     0.00     0.00  ExecEvalExpr
 11.38     78.34    12.50    10000     0.00     0.00  ExecEvalCase
  8.43     87.61     9.27 66348151     0.00     0.00  ExecEvalFuncArgs
  8.12     96.54     8.93 66348151     0.00     0.00  ExecMakeFunctionResult
  2.96     99.78     3.25 66348151     0.00     0.00  ExecEvalVar
  1.23    101.14     1.36    10058     0.00     0.00  AllocSetCheck
  1.23    102.49     1.35 66348151     0.00     0.00  ExecEvalOper
  1.12    103.72     1.24    76537     0.00     0.00  OpernameGetCandidates
  0.85    104.66     0.94 66424693     0.00     0.00  int4eq

(Note: I added LIMIT 10000 to the query so that the CASE is only carried
out 10000 times, rather than nearly 90000 times as in Arjen's original
test case.  Without this, the call-counter overflows for ExecEvalExpr,
and the time percentages seem to get confused.  One must recognize
though that this overstates the parser overhead compared to the original
test case.)

Clearly the useful work (int4eq) is getting a bit swamped by the ExecEval
mechanism.  I have some ideas about reducing the overhead, which I'll
post to the pghackers list in a bit.

            regards, tom lane

pgsql-performance by date:

Previous
From: Mike Bridge
Date:
Subject: Re: High CPU with 7.4.1 after running for about 2 weeks
Next
From: "Matthew T. O'Connor"
Date:
Subject: Re: rapid degradation after postmaster restart