Thread: MemoryContextSwitchTo during table scan?

From:
"Jignesh Shah"
Date:

Hello,

I am running PostgreSQL 8.0.x  on Solaris 10 AMD64. My Tablesize for this
test is about 80G. When I run a query on a column which is not indexed, I
get a full table scan query and that's what I am testing right now. (I am
artificially creating that scenario to improve that corner case).  Aparently
I find that the full query is running much slower compared to what hardware
can support and hence dug into DTrace to figure out where it is spending
most of its time.

Running a script (available on my blog) I find the following top 5 functions
where it spends most time during a 10 second run of the script
<PRE>
                                                    Time in (millisec)
    Call Count
MemoryContextSwitchTo                   775                           106564
LockBuffer                                        707
    109367
LWLockAcquire                                 440
  58888
ExecEvalConst                                  418
    53282
ResourceOwnerRememberBuffer      400                              54684
TransactionIdFollowsOrEquals           392
53281

</PRE>

While the times look pretty small (0.775 second out of 10seconds which is
about 7.75%), it still represents significant time since the table is pretty
big and the entire scan takes about 30 minute (about 80G big table).
Considering it is a single threaded single process scan all the hits of the
function calls itself can delay the performance.

MemoryContextSwitchTo and LockBuffer itself takes 15% of the total time of
the query. I was expecting "read" to be the slowest part (biggest component)
but it was way down in the 0.4% level.

Now the question is why there are so many calls to MemoryContextSwitchTo in
a single SELECT query command? Can it be minimized?

Also is there any way to optimize LockBuffer?

Is there anything else that can minimize the time spent in these calls
itself? (Of course it is the first iteration but something else will be the
bottleneck... but that's the goal).

If there are any hackers interested in tackling this problem let me know.

Thanks.
Regards,
Jignesh



From:
Tom Lane
Date:

"Jignesh Shah" <> writes:
> Running a script (available on my blog) I find the following top 5 functions
> where it spends most time during a 10 second run of the script

It's pretty risky to draw conclusions from only 10 seconds' worth of
gprof data --- that's only 1000 samples total at the common sampling
rate of 100/sec.  If there's one function eating 90% of the runtime,
you'll find out, but you don't have enough data to believe that you
know what is happening with resolution of a percent or so.  I generally
try to accumulate several minutes worth of CPU time in a gprof run.

> MemoryContextSwitchTo and LockBuffer itself takes 15% of the total time of
> the query. I was expecting "read" to be the slowest part (biggest component)
> but it was way down in the 0.4% level.

You do know that gprof counts only CPU time, and only user-space CPU
time at that?  read() isn't going to show up at all.  It's fairly likely
that your test case is I/O bound and that worrying about CPU efficiency
for it is a waste of time anyway.

            regards, tom lane

From:
"Luke Lonergan"
Date:

Tom,

On 8/22/05 8:41 AM, "Tom Lane" <> wrote:

>> MemoryContextSwitchTo and LockBuffer itself takes 15% of the total time of
>> the query. I was expecting "read" to be the slowest part (biggest component)
>> but it was way down in the 0.4% level.
>
> You do know that gprof counts only CPU time, and only user-space CPU
> time at that?  read() isn't going to show up at all.  It's fairly likely
> that your test case is I/O bound and that worrying about CPU efficiency
> for it is a waste of time anyway.

He's running DTRACE, a CPU profiler that uses hardware performance
registers, not gprof.  BTW, if you statically link your app, you get
profiling information for system calls with gprof.

Jignesh has been analyzing PG for quite a while, there are definite issues
with CPU consuming functions in the data path IMO.  This result he reported
is one of them on Solaris 10, and possibly on other platforms.

We are limited to about 130MB/s of I/O throughput for sequential scan on
platforms that can do 240MB/s of sequential 8k block I/O.  Though I haven't
profiled seqscan, I'd look at Jignesh's results carefully because they
correlate with our experience.

- Luke



From:
Josh Berkus
Date:

Jignesh,

> Also is there any way to optimize LockBuffer?

Yes, test on 8.1.   The buffer manager was re-written for 8.1.   You should
see a decrease in both LockBuffer and context switch activity.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

From:
"Jignesh Shah"
Date:

Hi Tom,

Like I mentioned I am using DTrace on Solaris 10 x64 and not gprof.
DTrace is not based on sampling but actual entry/exit point. Ofcourse my 10
second profile is just a sample that I can assure you is representative of
the query since it is a very simple query that does simple table scan. (I am
taken profiles at different times of the queries all giving similar outputs)

In case of DTrace I am actually measuring "wall clock" for leaf functions.

For more information on DTrace please refer to:
http://docs.sun.com/app/docs/doc/817-6223/6mlkidlf1?a=view

Regards,
Jignesh


----Original Message Follows----
From: Tom Lane <>
To: "Jignesh Shah" <>
CC: 
Subject: Re: [PERFORM] MemoryContextSwitchTo during table scan?
Date: Mon, 22 Aug 2005 11:41:40 -0400

"Jignesh Shah" <> writes:
 > Running a script (available on my blog) I find the following top 5
functions
 > where it spends most time during a 10 second run of the script

It's pretty risky to draw conclusions from only 10 seconds' worth of
gprof data --- that's only 1000 samples total at the common sampling
rate of 100/sec.  If there's one function eating 90% of the runtime,
you'll find out, but you don't have enough data to believe that you
know what is happening with resolution of a percent or so.  I generally
try to accumulate several minutes worth of CPU time in a gprof run.

 > MemoryContextSwitchTo and LockBuffer itself takes 15% of the total time
of
 > the query. I was expecting "read" to be the slowest part (biggest
component)
 > but it was way down in the 0.4% level.

You do know that gprof counts only CPU time, and only user-space CPU
time at that?  read() isn't going to show up at all.  It's fairly likely
that your test case is I/O bound and that worrying about CPU efficiency
for it is a waste of time anyway.

            regards, tom lane



From:
Neil Conway
Date:

Jignesh Shah wrote:
> Now the question is why there are so many calls to MemoryContextSwitchTo
> in a single SELECT query command? Can it be minimized?

I agree with Tom -- if profiling indicates that MemoryContextSwitchTo()
is the bottleneck, I would be suspicious that your profiling setup is
misconfigured. MemoryContextSwitchTo() is essentially a function call,
two pointer assignments, and a function return. Try rerunning the test
with current sources -- MemoryContextSwitchTo() is now inlined when
using GCC, which should just leave the assignments.

-Neil