Thread: not exactly a bug report, but surprising behaviour

not exactly a bug report, but surprising behaviour

From
Gregory Stark
Date:
It seems the sort operation is done very late in the process, after functions
in the select column list have already been called and the results stored.
This makes using sequences to generate a sequential list of numbers have a
surprising behaviour.

I don't know what standards say about this, I imagine they don't have anything
relevant because sequences are nonstandard. But I'm pretty sure I recall doing
queries like this all the time in Oracle and not seeing behaviour like this.

Note that this makes the resulting data different if the index exists vs if it
doesn't. I see the same behaviour with INSERT (...) (SELECT...) so it's not
just CREATE TABLE AS doing it.

As I said I'm not sure this is wrong, just wanted to mention it in case it's
not intentional.


slo=> create sequence w;
CREATE SEQUENCE
slo=> create table w1 (w integer);
CREATE TABLE
slo=> insert into w1 values (1);
INSERT 229135376 1
slo=> insert into w1 values (2);
INSERT 229135377 1
slo=> insert into w1 values (3);
INSERT 229135378 1
slo=> create table w2 as (select nextval('w'),w from w1 order by w desc);
SELECT
slo=> select * from w2;
 nextval | w
---------+---
       3 | 3
       2 | 2
       1 | 1
(3 rows)

slo=> create index idx_w on w1(w);
CREATE INDEX
slo=> set enable_seqscan = off;
SET
slo=> create table w3 as (select nextval('w'),w from w1 order by w desc);
SELECT
slo=> select * from w3;
 nextval | w
---------+---
       4 | 3
       5 | 2
       6 | 1
(3 rows)


--
greg

Re: not exactly a bug report, but surprising behaviour

From
Bruno Wolff III
Date:
On Tue, Feb 04, 2003 at 14:11:20 -0500,
  Gregory Stark <gsstark@mit.edu> wrote:
>
> It seems the sort operation is done very late in the process, after functions
> in the select column list have already been called and the results stored.
> This makes using sequences to generate a sequential list of numbers have a
> surprising behaviour.

You can't count on that behavior anyway. Sequences only guarenty a unique
number. Other transactions can grab values from the sequence while
your transaction is running.

Re: not exactly a bug report, but surprising behaviour

From
Stephan Szabo
Date:
On 4 Feb 2003, Gregory Stark wrote:

> It seems the sort operation is done very late in the process, after functions
> in the select column list have already been called and the results stored.
> This makes using sequences to generate a sequential list of numbers have a
> surprising behaviour.

Select list entries are done before order by since you can order by the
output of a select list entry.

> slo=> create sequence w;
> CREATE SEQUENCE
> slo=> create table w1 (w integer);
> CREATE TABLE
> slo=> insert into w1 values (1);
> INSERT 229135376 1
> slo=> insert into w1 values (2);
> INSERT 229135377 1
> slo=> insert into w1 values (3);
> INSERT 229135378 1
> slo=> create table w2 as (select nextval('w'),w from w1 order by w desc);
> SELECT
> slo=> select * from w2;
>  nextval | w
> ---------+---
>        3 | 3
>        2 | 2
>        1 | 1
> (3 rows)

I think the query that would give you what you want in this case is
something like the following to force the order before doing the nextvals:

create table w2 as select nextval('w'), w from (select w from w1 order by
w desc) w1;

However, I'm not sure that you can count on that working in all cases
either.




Re: not exactly a bug report, but surprising behaviour

From
Greg Stark
Date:
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

> Select list entries are done before order by since you can order by the
> output of a select list entry.

I understood that, in fact I can't quite figure out how Oracle could possibly
do it differently. Perhaps if the sort refers only to data available
immediately it does the sort before calculating the output columns.

This would have a couple big advantages:

1) If columns were excluded from the results from limit/offset clauses then
   possibly expensive functions in the select list wouldn't have to be
   calculated. It also means if the sort is quick but the functions slow that
   the first rows would be returned quickly to the application even if the
   total time was the same.

2) The sort could be done on pointers to the tuples rather than pushing the
   data in the tuple around in the sort. Obviously the transaction integrity
   issues are tricky with this though. But it could save a lot of memory
   pressure from sorts.

On the other hand, then you get two different behaviours depending on whether
any output columns are listed in the order clause, which is a nonobvious side
effect.

> I think the query that would give you what you want in this case is
> something like the following to force the order before doing the nextvals:

I don't need a solution for my purposes. I just created the table and then did
an update. Of course that isn't guaranteed either since it depends on the
physical layout of the records and the behaviour of update, but it worked.

--
greg

Re: not exactly a bug report, but surprising behaviour

From
Stephan Szabo
Date:
On 4 Feb 2003, Greg Stark wrote:

> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
>
> > Select list entries are done before order by since you can order by the
> > output of a select list entry.
>
> I understood that, in fact I can't quite figure out how Oracle could possibly
> do it differently. Perhaps if the sort refers only to data available
> immediately it does the sort before calculating the output columns.

Does it always do it when it's only on input columns? Or only in some
cases?  It might be interesting to see.  For example, the same sort of
thing with an index existing/not existing might make a different or the
plan chosen.

> This would have a couple big advantages:
>
> 1) If columns were excluded from the results from limit/offset clauses then
>    possibly expensive functions in the select list wouldn't have to be
>    calculated. It also means if the sort is quick but the functions slow that
>    the first rows would be returned quickly to the application even if the
>    total time was the same.

That's a useful advantage.  You can probably get this effect from pushing
clauses into from list subselects as a current optimization, but it'd be
nice since that's not an obvious conversion.

> 2) The sort could be done on pointers to the tuples rather than pushing the
>    data in the tuple around in the sort. Obviously the transaction integrity
>    issues are tricky with this though. But it could save a lot of memory
>    pressure from sorts.

This'd be an advantage in the case of a single table.  Unless you can
guarantee the join keeps the ordering properties, I don't think this'd
help with anything that's doing joins or anything more complicated.


Re: not exactly a bug report, but surprising behaviour

From
Greg Stark
Date:
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

> > This would have a couple big advantages:
> >
> > 1) If columns were excluded from the results from limit/offset clauses then
> >    possibly expensive functions in the select list wouldn't have to be
> >    calculated. It also means if the sort is quick but the functions slow that
> >    the first rows would be returned quickly to the application even if the
> >    total time was the same.
>
> That's a useful advantage.  You can probably get this effect from pushing
> clauses into from list subselects as a current optimization, but it'd be
> nice since that's not an obvious conversion.

Not if the clause depends on the sort results such as limit/offset or having
clauses.

> > 2) The sort could be done on pointers to the tuples rather than pushing the
> >    data in the tuple around in the sort. Obviously the transaction integrity
> >    issues are tricky with this though. But it could save a lot of memory
> >    pressure from sorts.
>
> This'd be an advantage in the case of a single table.  Unless you can
> guarantee the join keeps the ordering properties, I don't think this'd
> help with anything that's doing joins or anything more complicated.

Joins shouldn't be a problem, just sort a list of pointers to records and have
a structure on the side saying which tables each pointer corresponds to and
which columns and expressions are requested from each. I'm not sure how to
retain transactional integrity with this though.

The tricky part would be evaluating when it would help. When sorting many
records with a narrow result set it wouldn't pay to generate pointers that
would be just as large as the original data and possibly push the original
data out of cache. But for sorting a limited number of records of many columns
each it would be way more efficient.

I'm watching this query I'm working on now drive the cpu to 100% for 6+ hours
with virtually no I/O. And I think it's the best I can do. All the times is
being spent moving bits around from one place to another. I'm entirely blocked
on memory throughput right now. The fewer data are actually pushed around the
faster sorts will be.

It occurs to me that it's possible postgres is doing this already at a lower
level of abstraction. The sort engine could load all the records to sort into
an array and generate a list of indexes into that array and sort the indexes.
But I get the impression from the previous conversation that that's not what's
happening.

I'm constantly surprised by how much work there is to build a good database
engine. But I'm also constantly amazed by how much postgres is already doing.
It's really quite remarkable how complete it is.

--
greg

Re: not exactly a bug report, but surprising behaviour

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> I'm watching this query I'm working on now drive the cpu to 100% for 6+ hours
> with virtually no I/O. And I think it's the best I can do. All the times is
> being spent moving bits around from one place to another.

It would be interesting to see a gprof profile of that.

> It occurs to me that it's possible postgres is doing this already at a lower
> level of abstraction.

Yes, tuplesort.c has heard of pushing pointers around rather than
copying records.  I wonder though whether data is being pushed out to
kernel disk buffers and then back in again --- do you have sort_mem
set large enough?  Another likely theory is that the interface layers
needed to access datatype-specific comparison routines are chewing the
cycles.  Need facts not speculation to know where the bottleneck is...

            regards, tom lane

Re: not exactly a bug report, but surprising behaviour

From
Stephan Szabo
Date:
On 5 Feb 2003, Greg Stark wrote:

>
> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
>
> > > This would have a couple big advantages:
> > >
> > > 1) If columns were excluded from the results from limit/offset clauses then
> > >    possibly expensive functions in the select list wouldn't have to be
> > >    calculated. It also means if the sort is quick but the functions slow that
> > >    the first rows would be returned quickly to the application even if the
> > >    total time was the same.
> >
> > That's a useful advantage.  You can probably get this effect from pushing
> > clauses into from list subselects as a current optimization, but it'd be
> > nice since that's not an obvious conversion.
>
> Not if the clause depends on the sort results such as limit/offset or having
> clauses.

Sorry, I thought we were speaking specifically of the
select expensive_function(...), col1, ... from table1, ... order by col1,
... limit ... case which probably safely can be written into
select expensive_function(...), col1, ... from (select col1, ... from
table1, ... order by col1, ... limit ...) foo;


> > > 2) The sort could be done on pointers to the tuples rather than pushing the
> > >    data in the tuple around in the sort. Obviously the transaction integrity
> > >    issues are tricky with this though. But it could save a lot of memory
> > >    pressure from sorts.
> >
> > This'd be an advantage in the case of a single table.  Unless you can
> > guarantee the join keeps the ordering properties, I don't think this'd
> > help with anything that's doing joins or anything more complicated.
>
> Joins shouldn't be a problem, just sort a list of pointers to records and have
> a structure on the side saying which tables each pointer corresponds to and
> which columns and expressions are requested from each. I'm not sure how to
> retain transactional integrity with this though.

Hmm, are you talking about pointers to the original tables or pointers to
the output of the last step?  The latter should be fine, but the former is
problematic I'd think.


Re: not exactly a bug report, but surprising behaviour

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > I'm watching this query I'm working on now drive the cpu to 100% for 6+ hours
> > with virtually no I/O. And I think it's the best I can do. All the times is
> > being spent moving bits around from one place to another.
>
> It would be interesting to see a gprof profile of that.

Ok, I have a copy of postgres built with -pg now. I'm running grpof on the
gmon.out file but all I'm seeing is this:

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
100.00      0.01     0.01        1    10.00    10.00  InitFreeSpaceMap
  0.00      0.01     0.00    12296     0.00     0.00  LWLockAssign
  0.00      0.01     0.00      323     0.00     0.00  GUC_yylex
  0.00      0.01     0.00      123     0.00     0.00  ShmemAlloc
  0.00      0.01     0.00      123     0.00     0.00  tas
  0.00      0.01     0.00       79     0.00     0.00  AllocSetFreeIndex
...

I assume this is only the postmaster initialization as I've run a smaller
version of the job which took 8 minutes. Is there an easy trick to getting
gmon.out output from a specific backend process?


> > It occurs to me that it's possible postgres is doing this already at a lower
> > level of abstraction.
>
> Yes, tuplesort.c has heard of pushing pointers around rather than
> copying records.  I wonder though whether data is being pushed out to
> kernel disk buffers and then back in again --- do you have sort_mem
> set large enough?

Yes, this is the job I asked about sort_mem for. I have it up to 64M and no
pgsql_tmp files are being created.

> Another likely theory is that the interface layers needed to access
> datatype-specific comparison routines are chewing the cycles. Need facts not
> speculation to know where the bottleneck is...

--
greg

Re: not exactly a bug report, but surprising behaviour

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Ok, I have a copy of postgres built with -pg now. I'm running grpof on the
> gmon.out file but all I'm seeing is this:

> I assume this is only the postmaster initialization as I've run a smaller
> version of the job which took 8 minutes. Is there an easy trick to getting
> gmon.out output from a specific backend process?

Two things: one, backends dump gmon.out into $PGDATA/base/yourdb/,
whereas the postmaster dumps into wherever you started it (it never does
chdir IIRC).  Two, if you're on Linux you must compile postmaster.c with
-DLINUX_PROFILE, or you won't get any profiling timer interrupts in
child processes.

Your sample looks like the second problem...

            regards, tom lane

Re: not exactly a bug report, but surprising behaviour

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
>
> Two things: one, backends dump gmon.out into $PGDATA/base/yourdb/,
> whereas the postmaster dumps into wherever you started it (it never does
> chdir IIRC).  Two, if you're on Linux you must compile postmaster.c with
> -DLINUX_PROFILE, or you won't get any profiling timer interrupts in
> child processes.
>
> Your sample looks like the second problem...

Actually I had both problems, I didn't realize the backends did a chdir.

I cancelled it partway through.

  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 16.59     97.88    97.88 69608411     0.00     0.00  ExecMakeFunctionResult
  7.08    139.65    41.77 79472258     0.00     0.00  comparetup_heap
  4.50    166.17    26.52 192211935     0.00     0.00  ExecEvalExpr
  3.94    189.41    23.24 155884536     0.00     0.00  AllocSetAlloc
  2.57    204.59    15.18 69608411     0.00     0.00  ExecEvalFuncArgs
  2.57    219.76    15.17 17360796     0.00     0.00  heap_formtuple
  2.25    233.03    13.27 248222443     0.00     0.00  AllocSetFreeIndex
  2.22    246.11    13.08 17360724     0.00     0.00  ExecTargetList
  2.17    258.92    12.81 45871623     0.00     0.00  ExecEvalParam
  1.99    270.65    11.73 17360796     0.00     0.00  DataFill
  1.82    281.37    10.72 23020517     0.00     0.00  AllocSetReset
  1.57    290.62     9.25 29962044     0.00     0.00  ExecEvalVar
  1.29    298.20     7.58 17360796     0.00     0.00  ComputeDataSize
  1.17    305.09     6.89 92337151     0.00     0.00  AllocSetFree
  1.13    311.74     6.64 97183879     0.00     0.00  myFunctionCall2
  1.11    318.27     6.54 155883526     0.00     0.00  MemoryContextAlloc
  1.07    324.57     6.30 97183879     0.00     0.00  inlineApplySortFunction
  1.05    330.75     6.17 17285775     0.00     0.00  newNode

--
greg

Re: not exactly a bug report, but surprising behaviour

From
Neil Conway
Date:
On Wed, 2003-02-05 at 15:49, Greg Stark wrote:
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls   s/call   s/call  name
>  16.59     97.88    97.88 69608411     0.00     0.00  ExecMakeFunctionResult
>   7.08    139.65    41.77 79472258     0.00     0.00  comparetup_heap
>   4.50    166.17    26.52 192211935     0.00     0.00  ExecEvalExpr

(Semi OT)

Annoyingly enough, I get similarly strange gprof output (all zeros in
the two right hand columns) on Debian -- PostgreSQL compiled with
-DLINUX_PROFILE -pg using gcc 3.2.2pre, glibc 2.3.1, binutils
2.13.90.0.16. I haven't been able to figure out what the problem is,
though.

If anyone has any suggestions on how to fix this, please speak up.

Cheers,

Neil
--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC




Re: not exactly a bug report, but surprising behaviour

From
Greg Stark
Date:
Neil Conway <neilc@samurai.com> writes:

> On Wed, 2003-02-05 at 15:49, Greg Stark wrote:
> >   %   cumulative   self              self     total
> >  time   seconds   seconds    calls   s/call   s/call  name
> >  16.59     97.88    97.88 69608411     0.00     0.00  ExecMakeFunctionResult
> >   7.08    139.65    41.77 79472258     0.00     0.00  comparetup_heap
> >   4.50    166.17    26.52 192211935     0.00     0.00  ExecEvalExpr
>
> (Semi OT)
>
> Annoyingly enough, I get similarly strange gprof output (all zeros in
> the two right hand columns) on Debian -- PostgreSQL compiled with
> -DLINUX_PROFILE -pg using gcc 3.2.2pre, glibc 2.3.1, binutils
> 2.13.90.0.16. I haven't been able to figure out what the problem is,
> though.
>
> If anyone has any suggestions on how to fix this, please speak up.

Well this is Debian also, but I don't think that's related. Those two columns
are "seconds per call". Look at how many calls those functions have, they take
very little time per call, much less than 1/100th of a second.

--
greg

Re: not exactly a bug report, but surprising behaviour

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Neil Conway <neilc@samurai.com> writes:
>> On Wed, 2003-02-05 at 15:49, Greg Stark wrote:
> %   cumulative   self              self     total
> time   seconds   seconds    calls   s/call   s/call  name
> 16.59     97.88    97.88 69608411     0.00     0.00  ExecMakeFunctionResult
> 7.08    139.65    41.77 79472258     0.00     0.00  comparetup_heap
> 4.50    166.17    26.52 192211935     0.00     0.00  ExecEvalExpr
>>
>> Annoyingly enough, I get similarly strange gprof output (all zeros in
>> the two right hand columns) on Debian --

> Well this is Debian also, but I don't think that's related.

Right.  Zeroes in the per-call columns are not unexpected.  If you get
zeroes in the "self seconds" or "calls" fields then you have a
measurement issue ... but what we're seeing here is just ye olde death
of a thousand cuts, viz a lot of calls on routines that don't take very
long in any one call.

It is annoying that ExecMakeFunctionResult seems to be chewing so much
CPU time, since it's nothing more than an interface routine.  But
offhand I have no ideas on how to improve the situation.

            regards, tom lane

Re: not exactly a bug report, but surprising behaviour

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > Neil Conway <neilc@samurai.com> writes:
> >> On Wed, 2003-02-05 at 15:49, Greg Stark wrote:
> > %   cumulative   self              self     total
> > time   seconds   seconds    calls   s/call   s/call  name
> > 16.59     97.88    97.88 69608411     0.00     0.00  ExecMakeFunctionResult
> > 7.08    139.65    41.77 79472258     0.00     0.00  comparetup_heap
> > 4.50    166.17    26.52 192211935     0.00     0.00  ExecEvalExpr
> >>
> >> Annoyingly enough, I get similarly strange gprof output (all zeros in
> >> the two right hand columns) on Debian --
>
> > Well this is Debian also, but I don't think that's related.
>
> Right.  Zeroes in the per-call columns are not unexpected.  If you get
> zeroes in the "self seconds" or "calls" fields then you have a
> measurement issue ... but what we're seeing here is just ye olde death
> of a thousand cuts, viz a lot of calls on routines that don't take very
> long in any one call.

Though it looks from the legend as if these columns really ought to be
"ms/call" not "s/call". Even so they would still be 0.00. It looks like
ExecMakeFunctionResult is taking about 1.6us/call.

> It is annoying that ExecMakeFunctionResult seems to be chewing so much
> CPU time, since it's nothing more than an interface routine.  But
> offhand I have no ideas on how to improve the situation.

Does ExecMakeFunctionResult have to do with pgpsql functions? Or internal
functions? I have a normal function that's being used in the query. It in turn
calls sqrt, pow, and cos. Should I inline my function manually? Should I do
... * ... instead of pow(..., 2) to calculate the square?

But I think this result is deceptive. It's still only 17% of the time. There
doesn't seem to be a way to get the linux gprof to display the total time per
function including children. That would indicate the place where algorithmic
changes would have the biggest impact.

--
greg

Re: not exactly a bug report, but surprising behaviour

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> But I think this result is deceptive. It's still only 17% of the time. There
> doesn't seem to be a way to get the linux gprof to display the total time per
> function including children.

Er ... did you look all the way through the output?  The standard gprof
output shows the flat timings you quoted followed by a call-tree report.

            regards, tom lane

Re: not exactly a bug report, but surprising behaviour

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
>
> Er ... did you look all the way through the output?  The standard gprof
> output shows the flat timings you quoted followed by a call-tree report.

Well, yes but I wasn't sure what to look for. It sort of looks to me like
gprof chose an odd place to root the recursive cycle which makes the result
confusing. At least to me.

index % time    self  children    called     name
[1]     88.3  194.70  326.46     228+534992216 <cycle 2 as a whole> [1]
               97.88   35.00 69608411             ExecMakeFunctionResult <cycle 2> [5]
                3.68  121.86 5731762             postquel_execute <cycle 2> [6]
               13.08   39.77 17360724             ExecTargetList <cycle 2> [17]
               26.52   23.01 192211935             ExecEvalExpr <cycle 2> [18]
                1.74   32.89 6127527             ExecScan <cycle 2> [23]
                0.67   30.64  448463             ExecUnique <cycle 2> [26]
               15.18    0.00 69608411             ExecEvalFuncArgs <cycle 2> [47]
                4.16    8.86 11463396             ExecEvalArrayRef <cycle 2> [54]
                2.74    8.95 5731958             ExecutePlan <cycle 2> [60]
                1.48    8.56 5412663             ExecSort <cycle 2> [65]
                2.70    7.04 17360724             ExecProject <cycle 2> [66]
                3.11    3.13 5560629             ExecNestLoop <cycle 2> [86]
                2.00    3.23 5731890             ExecResult <cycle 2> [93]
                5.20    0.00 23281174             ExecProcNode <cycle 2> [94]
                4.16    0.05 35012021             ExecEvalFunc <cycle 2> [104]
                3.28    0.34 5731762             fmgr_sql <cycle 2> [113]
                3.21    0.03 34596390             ExecEvalOper <cycle 2> [117]
                1.18    1.70 5731958             ExecutorRun <cycle 2> [122]
                1.12    0.00 5731762             postquel_getnext <cycle 2> [152]
                0.47    0.35 5748363             ExecQual <cycle 2> [167]
                0.81    0.00 5678894             ExecIndexScan <cycle 2> [168]
                0.00    0.49    2116             make_op <cycle 2> [182]
                0.13    0.32   59133             ExecIndexReScan <cycle 2> [187]
                0.12    0.00   59133             ExecReScan <cycle 2> [217]
                0.00    0.05    1054             ParseFuncOrColumn <cycle 2> [244]
                0.05    0.00  448463             SubqueryNext <cycle 2> [245]
                0.00    0.04     184             pg_parse_and_rewrite <cycle 2> [263]
                0.01    0.02    4657             simplify_op_or_func <cycle 2> [266]
                0.00    0.03    1005             coerce_to_target_type <cycle 2> [270]
                0.00    0.03    1735             coerce_type <cycle 2> [276]
                0.00    0.03    9859             expression_tree_mutator <cycle 2> [277]
                0.00    0.02     759             typecast_expression <cycle 2> [291]
                0.01    0.01     198+409         set_plan_references <cycle 2> [292]
                0.01    0.00  448463             ExecSubqueryScan <cycle 2> [313]
                0.00    0.00   60857             ExecEvalExprSwitchContext <cycle 2> [350]
                0.00    0.00     242             transformArraySubscripts <cycle 2> [366]
                0.00    0.00     184             init_sql_fcache <cycle 2> [373]
                0.00    0.00     446+6484        transformExpr <cycle 2> [375]
                0.00    0.00   10680             eval_const_expressions_mutator <cycle 2> [377]
                0.00    0.00     198             transformSelectStmt <cycle 2> [387]
                0.00    0.00       5             set_base_rel_pathlists <cycle 2> [391]
                0.00    0.00     202             pg_analyze_and_rewrite <cycle 2> [395]
                0.00    0.00       5             make_one_rel <cycle 2> [408]
                0.00    0.00       5             subplanner <cycle 2> [417]
                0.00    0.00     202             parse_analyze <cycle 2> [438]
                0.00    0.00     199             query_planner <cycle 2> [440]
                0.00    0.00     184             init_execution_state <cycle 2> [450]
                0.00    0.00     200             transformTargetEntry <cycle 2> [453]
                0.00    0.00     198             transformTargetList <cycle 2> [463]
                0.00    0.00     198             planner <cycle 2> [489]
                0.00    0.00       1             transformInsertStmt <cycle 2> [527]
                0.00    0.00     199             grouping_planner <cycle 2> [536]
                0.00    0.00     597             preprocess_expression <cycle 2> [538]
                0.00    0.00     199             subquery_planner <cycle 2> [540]
                0.00    0.00       1             ExecAgg <cycle 2> [557]
                0.00    0.00       7             create_plan <cycle 2> [562]
                0.00    0.00      11             transformColumnRef <cycle 2> [571]
                0.00    0.00       1             set_subquery_pathlist <cycle 2> [610]
                0.00    0.00      27             join_references_mutator <cycle 2> [637]
                0.00    0.00     199+8           pull_up_subqueries <cycle 2> [651]
                0.00    0.00     396             transformWhereClause <cycle 2> [667]
                0.00    0.00       1             create_nestloop_plan <cycle 2> [708]
                0.00    0.00       8             ResolveNew_mutator <cycle 2> [757]
                0.00    0.00     203             transformStmt <cycle 2> [860]
                0.00    0.00       1             set_uppernode_references <cycle 2> [887]
                0.00    0.00       1             create_join_plan <cycle 2> [919]
                0.00    0.00    4232             make_operand <cycle 2> [994]
                0.00    0.00    1837             eval_const_expressions <cycle 2> [1005]
                0.00    0.00    1266             transformIndirection <cycle 2> [1008]
                0.00    0.00    1054             make_arguments <cycle 2> [1015]
                0.00    0.00     199+6           preprocess_qual_conditions <cycle 2> [1069]
                0.00    0.00     198             pg_plan_query <cycle 2> [1072]
                0.00    0.00     170             ExecSeqScan <cycle 2> [1084]
                0.00    0.00      33             replace_correlation_vars_mutator <cycle 2> [1149]
                0.00    0.00       5             join_references <cycle 2> [1220]
                0.00    0.00       3             replace_vars_with_subplan_refs_mutator <cycle 2> [1276]
                0.00    0.00       3             SS_replace_correlation_vars <cycle 2> [1266]
                0.00    0.00       3             ResolveNew <cycle 2> [1265]
                0.00    0.00       2             replace_vars_with_subplan_refs <cycle 2> [1325]
                0.00    0.00       1+1           resolvenew_in_jointree <cycle 2> [1449]
                0.00    0.00       1             set_join_references <cycle 2> [1453]
-----------------------------------------------
                0.00   23.77      11/201         _SPI_execute [11]
                0.00  410.53     190/201         _SPI_execute_plan [4]
[2]     73.6    0.00  434.30     201         _SPI_pquery [2]
              162.25  272.05     190/228         ExecutorRun <cycle 2> [122]
                0.00    0.00     190/5731956     ExecutorStart [8]
                0.00    0.00     189/5731953     ExecutorEnd [38]
                0.00    0.00     127/130         _SPI_checktuples [4917]
-----------------------------------------------
                                                 <spontaneous>
[3]     69.6    0.00  410.54                 SPI_execp [3]
                0.00  410.53     190/190         _SPI_execute_plan [4]
                0.00    0.00     190/212         _SPI_copy_plan [365]
                0.00    0.00     189/203         _SPI_end_call [439]
                0.00    0.00     190/215         _SPI_begin_call [788]
-----------------------------------------------
                0.00  410.53     190/190         SPI_execp [3]
[4]     69.6    0.00  410.53     190         _SPI_execute_plan [4]
                0.00  410.53     190/201         _SPI_pquery [2]
                0.00    0.00     379/395         CommandCounterIncrement [454]
                0.00    0.00     190/441         CreateExecutorState [443]
                0.00    0.00     190/389         CreateQueryDesc [535]
                0.00    0.00     189/155883526     MemoryContextAlloc [21]

--
greg

Re: not exactly a bug report, but surprising behaviour

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Well, yes but I wasn't sure what to look for. It sort of looks to me like
> gprof chose an odd place to root the recursive cycle which makes the result
> confusing. At least to me.

_SPI_execute_plan?  I thought you said this was a simple SELECT...ORDER BY
query.

            regards, tom lane