Re: Various performance questions - Mailing list pgsql-performance

From	Tom Lane
Subject	Re: Various performance questions
Date	October 27, 2003 15:27:00
Msg-id	2167.1067282767@sss.pgh.pa.us
In response to	Re: Various performance questions (Greg Stark <gsstark@mit.edu>)
List	pgsql-performance

Greg Stark <gsstark@mit.edu> writes:
> I'm still puzzled why the times on these are so different when the latter
> returns fewer records and both are doing sequential scans:

My best guess is that it's simply the per-tuple overhead of cycling
tuples through the two plan nodes.  When you have no actual I/O happening,
the seqscan runtime is going to be all CPU time, something of the form
    cost_per_page * number_of_pages_processed +
    cost_per_tuple_scanned * number_of_tuples_scanned +
    cost_per_tuple_returned * number_of_tuples_returned
I don't have numbers for the relative sizes of those three costs, but
I doubt that any of them are negligible compared to the other two.

Adding a WHERE clause increases cost_per_tuple_scanned but reduces the
number_of_tuples_returned, and so it cuts the contribution from the
third term, evidently by more than the WHERE clause adds to the second
term.
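
To make the three-term model above concrete, here is a minimal sketch. It is not from the original message: the unit costs and row counts are made-up numbers in arbitrary integer units, chosen only to show how a filter can raise the second term and still lower the total.

```python
def seqscan_cpu_cost(pages, tuples_scanned, tuples_returned,
                     cost_per_page, cost_per_tuple_scanned,
                     cost_per_tuple_returned):
    """CPU-only seqscan cost: the sum of the three terms in the model."""
    return (cost_per_page * pages
            + cost_per_tuple_scanned * tuples_scanned
            + cost_per_tuple_returned * tuples_returned)


# Hypothetical values (assumptions for illustration, not measurements).
PAGES = 1_000
TUPLES = 100_000

# No WHERE clause: every scanned tuple is also returned.
no_filter = seqscan_cpu_cost(PAGES, TUPLES, TUPLES,
                             cost_per_page=1000,
                             cost_per_tuple_scanned=10,
                             cost_per_tuple_returned=20)

# With a WHERE clause: each scanned tuple costs more (10 -> 15) because
# the qual is evaluated, but only 10% of the tuples are returned.
with_filter = seqscan_cpu_cost(PAGES, TUPLES, TUPLES // 10,
                               cost_per_page=1000,
                               cost_per_tuple_scanned=15,
                               cost_per_tuple_returned=20)

print(no_filter, with_filter)
```

With these assumed numbers the filtered scan is cheaper overall, because the saving in the third term outweighs the increase in the second. Different unit costs could reverse that.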

My own profiling had suggested that the cost-per-tuple-scanned in the
aggregate node dominated the seqscan CPU costs, but that might be
platform-specific, or possibly have something to do with the fact that
I was profiling an assert-enabled build.

It might be worth pointing out that EXPLAIN ANALYZE adds two kernel
calls (gettimeofday or some such) into each cycle of the plan nodes;
that's probably inflating the cost_per_tuple_returned by a noticeable
amount.
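
A rough way to get a feel for that instrumentation overhead is sketched below. This is an illustration, not something from the original message. It uses Python's `time.perf_counter` as a stand-in for the clock call, and the measured figure includes Python loop overhead, so it is only an upper-bound estimate. The function names and the `calls_per_cycle=2` default are assumptions taken from the "two kernel calls" remark above.

```python
import time


def clock_read_cost(samples=100_000):
    """Estimate the average seconds spent per clock read.

    The loop overhead is included, so this overstates the cost of the
    clock call itself.
    """
    start = time.perf_counter()
    for _ in range(samples):
        time.perf_counter()
    end = time.perf_counter()
    return (end - start) / samples


def instrumentation_overhead(node_cycles, per_call_cost, calls_per_cycle=2):
    """Total time added by timing calls across all plan-node cycles."""
    return node_cycles * calls_per_cycle * per_call_cost


per_call = clock_read_cost()
# Hypothetical workload: one million tuples passing through one plan node.
print(instrumentation_overhead(1_000_000, per_call))
```

Comparing that figure with the query's runtime without instrumentation gives an idea of how much of the measured per-tuple cost could be timing overhead.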

            regards, tom lane
