Thread: MemoryContextSwitchTo during table scan?
Hello, I am running PostgreSQL 8.0.x on Solaris 10 AMD64. My Tablesize for this test is about 80G. When I run a query on a column which is not indexed, I get a full table scan query and that's what I am testing right now. (I am artificially creating that scenario to improve that corner case). Aparently I find that the full query is running much slower compared to what hardware can support and hence dug into DTrace to figure out where it is spending most of its time. Running a script (available on my blog) I find the following top 5 functions where it spends most time during a 10 second run of the script <PRE> Time in (millisec) Call Count MemoryContextSwitchTo 775 106564 LockBuffer 707 109367 LWLockAcquire 440 58888 ExecEvalConst 418 53282 ResourceOwnerRememberBuffer 400 54684 TransactionIdFollowsOrEquals 392 53281 </PRE> While the times look pretty small (0.775 second out of 10seconds which is about 7.75%), it still represents significant time since the table is pretty big and the entire scan takes about 30 minute (about 80G big table). Considering it is a single threaded single process scan all the hits of the function calls itself can delay the performance. MemoryContextSwitchTo and LockBuffer itself takes 15% of the total time of the query. I was expecting "read" to be the slowest part (biggest component) but it was way down in the 0.4% level. Now the question is why there are so many calls to MemoryContextSwitchTo in a single SELECT query command? Can it be minimized? Also is there any way to optimize LockBuffer? Is there anything else that can minimize the time spent in these calls itself? (Of course it is the first iteration but something else will be the bottleneck... but that's the goal). If there are any hackers interested in tackling this problem let me know. Thanks. Regards, Jignesh
"Jignesh Shah" <jigneshk@hotmail.com> writes: > Running a script (available on my blog) I find the following top 5 functions > where it spends most time during a 10 second run of the script It's pretty risky to draw conclusions from only 10 seconds' worth of gprof data --- that's only 1000 samples total at the common sampling rate of 100/sec. If there's one function eating 90% of the runtime, you'll find out, but you don't have enough data to believe that you know what is happening with resolution of a percent or so. I generally try to accumulate several minutes worth of CPU time in a gprof run. > MemoryContextSwitchTo and LockBuffer itself takes 15% of the total time of > the query. I was expecting "read" to be the slowest part (biggest component) > but it was way down in the 0.4% level. You do know that gprof counts only CPU time, and only user-space CPU time at that? read() isn't going to show up at all. It's fairly likely that your test case is I/O bound and that worrying about CPU efficiency for it is a waste of time anyway. regards, tom lane
Tom, On 8/22/05 8:41 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: >> MemoryContextSwitchTo and LockBuffer itself takes 15% of the total time of >> the query. I was expecting "read" to be the slowest part (biggest component) >> but it was way down in the 0.4% level. > > You do know that gprof counts only CPU time, and only user-space CPU > time at that? read() isn't going to show up at all. It's fairly likely > that your test case is I/O bound and that worrying about CPU efficiency > for it is a waste of time anyway. He's running DTRACE, a CPU profiler that uses hardware performance registers, not gprof. BTW, if you statically link your app, you get profiling information for system calls with gprof. Jignesh has been analyzing PG for quite a while, there are definite issues with CPU consuming functions in the data path IMO. This result he reported is one of them on Solaris 10, and possibly on other platforms. We are limited to about 130MB/s of I/O throughput for sequential scan on platforms that can do 240MB/s of sequential 8k block I/O. Though I haven't profiled seqscan, I'd look at Jignesh's results carefully because they correlate with our experience. - Luke
Jignesh, > Also is there any way to optimize LockBuffer? Yes, test on 8.1. The buffer manager was re-written for 8.1. You should see a decrease in both LockBuffer and context switch activity. -- --Josh Josh Berkus Aglio Database Solutions San Francisco
Hi Tom, Like I mentioned I am using DTrace on Solaris 10 x64 and not gprof. DTrace is not based on sampling but actual entry/exit point. Ofcourse my 10 second profile is just a sample that I can assure you is representative of the query since it is a very simple query that does simple table scan. (I am taken profiles at different times of the queries all giving similar outputs) In case of DTrace I am actually measuring "wall clock" for leaf functions. For more information on DTrace please refer to: http://docs.sun.com/app/docs/doc/817-6223/6mlkidlf1?a=view Regards, Jignesh ----Original Message Follows---- From: Tom Lane <tgl@sss.pgh.pa.us> To: "Jignesh Shah" <jigneshk@hotmail.com> CC: pgsql-performance@postgresql.org Subject: Re: [PERFORM] MemoryContextSwitchTo during table scan? Date: Mon, 22 Aug 2005 11:41:40 -0400 "Jignesh Shah" <jigneshk@hotmail.com> writes: > Running a script (available on my blog) I find the following top 5 functions > where it spends most time during a 10 second run of the script It's pretty risky to draw conclusions from only 10 seconds' worth of gprof data --- that's only 1000 samples total at the common sampling rate of 100/sec. If there's one function eating 90% of the runtime, you'll find out, but you don't have enough data to believe that you know what is happening with resolution of a percent or so. I generally try to accumulate several minutes worth of CPU time in a gprof run. > MemoryContextSwitchTo and LockBuffer itself takes 15% of the total time of > the query. I was expecting "read" to be the slowest part (biggest component) > but it was way down in the 0.4% level. You do know that gprof counts only CPU time, and only user-space CPU time at that? read() isn't going to show up at all. It's fairly likely that your test case is I/O bound and that worrying about CPU efficiency for it is a waste of time anyway. regards, tom lane
Jignesh Shah wrote: > Now the question is why there are so many calls to MemoryContextSwitchTo > in a single SELECT query command? Can it be minimized? I agree with Tom -- if profiling indicates that MemoryContextSwitchTo() is the bottleneck, I would be suspicious that your profiling setup is misconfigured. MemoryContextSwitchTo() is essentially a function call, two pointer assignments, and a function return. Try rerunning the test with current sources -- MemoryContextSwitchTo() is now inlined when using GCC, which should just leave the assignments. -Neil