Re: [HACKERS] Profile of current backend - Mailing list pgsql-hackers

From Mattias Kregert
Subject Re: [HACKERS] Profile of current backend
Date
Msg-id 34DADEB0.60E6313D@algonet.se
In response to Re: [HACKERS] Profile of current backend  (Bruce Momjian <maillist@candle.pha.pa.us>)
Responses Re: [HACKERS] Profile of current backend
List pgsql-hackers
Bruce Momjian wrote:
>
> Interesting.  Nothing is jumping out at me.  Looks like we could try to
> clean up heapgettup() to see if there is anything in there that can be
> speeded up.
>
> None of the calls looks like it should be inlined.  Do you see any that
> look good for inlining?

ExecScan() seems to be the only func which calls SeqNext(), which in
turn accounts for 60% of the calls to heap_getnext(), which does 80% of
the calls to heapgettup().

- Put SeqNext() into ExecScan() to lower function call overhead? [minimal optim.]

- In heapgettup(), 50% of the time is the func itself and 50% is in called funcs.
  Top four CPU consumers
  (gprof columns: self secs, children secs, calls/total calls, name [index]):
    0.04    0.14    9924/9924        RelationGetBufferWithBuffer [148]
    0.03    0.15    5642/5702        ReleaseAndReadBuffer [145]
    0.10    0.00   26276/42896       nocachegetattr [158]
    0.01    0.08    7111/9607        HeapTupleSatisfiesVisibility [185]

  RelationGetBufferWithBuffer() seems to be called from here only. If so, inline.

- Looking at RelationGetBufferWithBuffer():
    0.00    0.10    4603/32354       ReadBuffer [55]
  ReadBuffer() is the biggest CPU consumer called by RelationGetBufferWithBuffer()
  (55% of its time).

  -> *** 97% of ReadBuffer() CPU time is in calling ReadBufferWithBufferLock()

  -> 85% of ReadBufferWithBufferLock() CPU time is in calling BufferAlloc().
  -> ReadBufferWithBufferLock() is the only func calling BufferAlloc().
  -> Conclusion: INLINE BufferAlloc().

- Looking at BufferAlloc():
    0.04    0.25   37974/37974       BufTableLookup [114]
    0.10    0.00   32340/151781      SpinAcquire [81]
    0.10    0.00   37470/40585       PinBuffer [209]
    0.08    0.00   38478/43799       RelationGetLRelId [234]
    0.04    0.00   37974/151781      SpinRelease [175]

  -> 40% of BufferAlloc() CPU time is in calling BufTableLookup().
  -> BufferAlloc() is the only func calling BufTableLookup().
  -> Conclusion: INLINE BufTableLookup().

- Looking at BufTableLookup():
  86% of CPU time is in calling hash_search(). The rest is own time.

- Looking at hash_search():
    0.13    0.41  179189/179189      call_hash [69]
    0.00    0.00       6/6           bucket_alloc [1084]
  -> Conclusion: INLINE call_hash() [and bucket_alloc()] into hash_search().

- Looking at call_hash():
    0.37    0.00  171345/171345      tag_hash [94]
    0.04    0.00    7844/7844        string_hash [348]
  -> Conclusion: INLINE tag_hash() [and string_hash()] into call_hash().
  -> Perhaps disk_hash() could be used in some way? It is currently #ifdef'd away.
  -> Could we use a lookup table instead of doing hash calculations? Wouldn't that
     be much faster?
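
One way to read the last two suggestions is sketched below: a hash specialised for a small fixed-size tag, written so the compiler can inline it, with no per-byte loop and no call through a function pointer. Everything here is an assumption made for illustration. DemoTag, demo_tag_hash(), demo_bucket() and DEMO_NBUCKETS are hypothetical names, the mixing constants are generic choices, and none of it is the actual tag_hash() or buffer tag layout.

```c
#include <stdint.h>

/* Hypothetical stand-in for a buffer tag; NOT the real backend struct. */
typedef struct DemoTag {
    uint32_t rel_id;     /* assumed relation identifier */
    uint32_t block_num;  /* assumed block number within the relation */
} DemoTag;

#define DEMO_NBUCKETS 64u  /* power of two, so masking picks a bucket */

/* Fixed-size tag hash: mixes the two words directly instead of looping
 * over the key bytes. The constants are generic mixing values chosen
 * for this sketch only. */
static inline uint32_t demo_tag_hash(const DemoTag *tag)
{
    uint32_t h = tag->rel_id * 2654435761u;
    h ^= tag->block_num + 0x9e3779b9u + (h << 6) + (h >> 2);
    return h;
}

/* Map a tag to a bucket index in the range [0, DEMO_NBUCKETS). */
static inline uint32_t demo_bucket(const DemoTag *tag)
{
    return demo_tag_hash(tag) & (DEMO_NBUCKETS - 1u);
}
```

Whether this beats the existing hash would have to be measured with the same profile run; the sketch only shows the shape of the change.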


It looks to me as if there are too many levels of function calls.
Perhaps all functions which are called by only one other func should be inlined?
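
A minimal sketch of what folding a single-caller function into its caller could look like, assuming a compiler that accepts `static inline` (C99, or gcc's extension). The names DemoScan, demo_seq_next() and demo_exec_scan() are hypothetical and only mimic the ExecScan()/SeqNext() relationship; this is not backend code.

```c
#include <stddef.h>

/* Hypothetical scan state; not a backend structure. */
typedef struct DemoScan {
    const int *rows;   /* rows to scan */
    size_t     nrows;  /* number of rows */
    size_t     pos;    /* index of the next row to return */
} DemoScan;

/* Helper with exactly one caller. Marking it static inline lets the
 * compiler merge it into that caller and drop the call overhead, while
 * the source keeps the two steps separate. */
static inline const int *demo_seq_next(DemoScan *scan)
{
    if (scan->pos >= scan->nrows)
        return NULL;               /* end of scan */
    return &scan->rows[scan->pos++];
}

/* The single caller: sums every row whose value is >= min_value. */
static long demo_exec_scan(DemoScan *scan, int min_value)
{
    long       sum = 0;
    const int *row;

    while ((row = demo_seq_next(scan)) != NULL) {
        if (*row >= min_value)
            sum += *row;
    }
    return sum;
}
```

The compiler is free to ignore the hint, so any gain still has to be confirmed by re-running the profile.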


Guesstimate:
  This would speed up heapgettup() by 10% ???
  Other functions would speed up too.
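
For checking the percentages quoted earlier in this message, the arithmetic can be sketched as below. It assumes the first two numbers on each gprof call-graph line are self seconds and children seconds, and that a line whose call count is N/N carries all of that function's time. demo_callee_share() is a hypothetical helper, not part of gprof.

```c
/* Fraction of a caller's total time (self + children) that one callee
 * line accounts for. Returns 0.0 when the caller has no recorded time. */
static double demo_callee_share(double caller_self, double caller_children,
                                double callee_self, double callee_children)
{
    double caller_total = caller_self + caller_children;

    if (caller_total <= 0.0)
        return 0.0;
    return (callee_self + callee_children) / caller_total;
}
```

With the RelationGetBufferWithBuffer line (0.04 self, 0.14 children) as the caller and the ReadBuffer line (0.00, 0.10) as the callee, this gives 0.10 / 0.18, roughly 0.56, which matches the 55% figure above.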


/* m */
