Re: Plan for relatively simple query seems to be very inefficient - Mailing list pgsql-performance

From Tom Lane
Subject Re: Plan for relatively simple query seems to be very inefficient
Date
Msg-id 3641.1112817249@sss.pgh.pa.us
Whole thread Raw
In response to Re: Plan for relatively simple query seems to be very inefficient  ("Dave Held" <dave.held@arrayservicesgrp.com>)
List pgsql-performance
Arjen van der Meijden <acmmailing@vulcanus.its.tudelft.nl> writes:
> On 6-4-2005 20:09, Tom Lane wrote:
>> Comparing the nestloop case to the hash case does make one think that
>> there's an awful lot of overhead somewhere, though.  Two int2
>> comparisons ought not take very long :-(.  Arjen, are you interested
>> in getting a gprof profile of what the backend is doing in the nestloop
>> -with-materialize plan?  Or if you don't want to mess with it, please
>> send me the data off-list and I'll run a profile.

> Here you go, both are full pg_dump-dumps with create-data (including the
> index on data_main.range).

Well, indeed int2ge and int2le are pretty far down the list, but the
stuff that's near the top has already been beat on pretty heavily :-(.
I'm not sure there is a lot we can do about this short of a wholesale
redesign of the way we do expression evaluation.

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 36.14     21.30    21.30                             _mcount
  7.62     25.79     4.49 13412606     0.00     0.00  ExecMakeFunctionResultNoSets
  5.46     29.01     3.22 26825216     0.00     0.00  slot_getattr
  4.19     31.48     2.47 26825216     0.00     0.00  ExecEvalVar
  3.87     33.76     2.28 13189120     0.00     0.00  ExecMaterial
  3.38     35.75     1.99 13494688     0.00     0.00  slot_deform_tuple
  3.38     37.74     1.99                             noshlibs
  3.12     39.58     1.84 13353893     0.00     0.00  ExecProcNode
  2.99     41.34     1.76 13107201     0.00     0.00  ExecQual
  2.90     43.05     1.71 13271974     0.00     0.00  AllocSetReset
  2.72     44.65     1.60                             ExecEvalVar
  2.43     46.08     1.43                             $$dyncall
  2.24     47.40     1.32 13271972     0.00     0.00  MemoryContextReset
  2.24     48.72     1.32 13188960     0.00     0.00  tuplestore_gettuple
  2.12     49.97     1.25 13189441     0.00     0.00  ExecStoreTuple
  1.80     51.03     1.06    82689     0.01     0.06  ExecNestLoop
  1.70     52.03     1.00 13354235     0.00     0.00  ExecClearTuple
  1.63     52.99     0.96 13412761     0.00     0.00  check_stack_depth
  1.58     53.92     0.93                             AllocSetReset
  1.29     54.68     0.76                             int2ge
  1.20     55.39     0.71                             ExecMakeFunctionResultNoSets
  1.14     56.06     0.67 13107200     0.00     0.00  int2ge
  1.05     56.68     0.62                             ExecEvalCoerceToDomain
  1.04     57.29     0.61 13189120     0.00     0.00  tuplestore_ateof
  0.64     57.67     0.38 13271972     0.00     0.00  MemoryContextResetChildren
  0.41     57.91     0.24                             readtup_heap
  0.36     58.12     0.21                             log_disconnections
  0.24     58.26     0.14                             BlessTupleDesc
  0.19     58.37     0.11                             ExecCountSlotsMaterial
  0.14     58.45     0.08                             MemoryContextAllocZeroAligned
  0.12     58.52     0.07                             ExecProcNode
  0.10     58.58     0.06                             int42div
  0.08     58.63     0.05                             AllocSetStats
  0.05     58.66     0.03   166022     0.00     0.00  LockBuffer
  0.05     58.69     0.03    82688     0.00     0.00  advance_transition_function
  0.05     58.72     0.03    82080     0.00     0.00  HeapTupleSatisfiesSnapshot
  0.05     58.75     0.03                             ExecInitNestLoop
  0.03     58.77     0.02                             SeqNext
  0.02     58.78     0.01   305408     0.00     0.00  int2le
  0.02     58.79     0.01    84231     0.00     0.00  LWLockAcquire
  0.02     58.80     0.01    82849     0.00     0.00  ExecProject
  0.02     58.81     0.01    82848     0.00     0.00  ExecVariableList
  0.02     58.82     0.01    82844     0.00     0.00  ResourceOwnerEnlargeBuffers
  0.02     58.83     0.01    82844     0.00     0.00  ResourceOwnerRememberBuffer
  0.02     58.84     0.01    82813     0.00     0.00  ReleaseAndReadBuffer
  0.02     58.85     0.01    82688     0.00     0.00  ExecEvalConst
  0.02     58.86     0.01    82688     0.00     0.00  ExecEvalExprSwitchContext
  0.02     58.87     0.01    82688     0.00     0.00  advance_aggregates
  0.02     58.88     0.01    82084     0.00     0.00  heapgettup
  0.02     58.89     0.01    81920     0.00     0.00  ExecMaterialReScan
  0.02     58.90     0.01    81920     0.00     0.00  ExecReScan
  0.02     58.91     0.01       19     0.53     0.53  downcase_truncate_identifier
  0.02     58.92     0.01       10     1.00     1.00  AllocateFile
  0.02     58.93     0.01        1    10.00    70.59  agg_retrieve_direct
[ nothing else shows as having any sample hits ]

_mcount is profiler overhead, in case you were wondering; ignore it and
mentally scale all the other percentages up by 20% or so.

            regards, tom lane

pgsql-performance by date:

Previous
From: Rod Taylor
Date:
Subject: Re: Réf
Next
From: William Yu
Date:
Subject: Re: How to improve db performance with $7K?