Re: Re : Solaris Performance - Profiling - Mailing list pgsql-general
From | Tom Lane |
---|---|
Subject | Re: Re : Solaris Performance - Profiling |
Date | |
Msg-id | 25091.1017763371@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Re : Solaris Performance - Profiling (Mark kirkwood <markir@slingshot.co.nz>) |
Responses |
Re: Re : Solaris Performance - Profiling (Solved)
|
List | pgsql-general |
Mark kirkwood <markir@slingshot.co.nz> writes: > I am a little concerned that the cpu time recored by the profiler seems > to be a bit on the short side at 4047.53 seconds (67 minutes), as > opposed to 119.2 for the backend itself. Are these two numbers supposed > to match closely ? (they did for shorter queries..) Hmm. Where exactly did you get those numbers from? I see 4118.54 sec as the total CPU accounted for in the profile. Some versions of gprof hide the time spent in the profiler subroutine, leading to discrepancies between the time accounted for in the profile and the actual process CPU time. However, yours doesn't seem to be one of them --- internal_mcount is right there. > Anyway in the hope that this captured output is interesting enough, here > it is...or rather here is some of it (biggish file), so I am leaving a > complete copy under : > http://homepages.slingshot.co.nz/~markir/tar/solaris/ > Let me know if further information is likely to be helpful. I have > profiled the same query (on the same dataset) on Freebsd and have the > output available at : > http://homepages.slingshot.co.nz/~markir/tar/freebsd/ Hmm. Assuming that the profile data is trustworthy and the queries are indeed the same (did you compare EXPLAIN output?), it seems that Solaris' problem is a spectacularly bad qsort() implementation. The FreeBSD profile shows: 1 qsort [29] 0.00 0.31 1/2 load_password_cache [44] 0.00 0.31 1/2 tuplesort_performsort [45] [29] 9.8 0.01 0.62 2+1 qsort [29] 0.01 0.62 148033/148033 qsort_comparetup [30] 1 qsort [29] where Solaris has 0.00 1387.23 1/2 load_password_cache [18] 0.00 1387.23 1/2 tuplesort_performsort [19] [4] 68.5 0.00 2774.46 2 qsort [4] 19.73 2754.58 1/1 qst [5] 0.01 0.14 88012/1680561053 qsort_comparetup [6] 0.00 0.00 1/61473 .umul [154] ----------------------------------------------- 19.73 2754.58 1/1 qsort [4] [5] 68.5 19.73 2754.58 1 qst [5] 167.74 2586.81 1680473041/1680561053 qsort_comparetup [6] 0.02 0.00 59630/59703 .udiv [114] 0.01 0.00 59630/61473 .umul [154] and all the rest of the top profile entries are explained by the fact that qsort_comparetup is called 1.68 billion times instead of 148K times. Can these really be the same dataset? Can Solaris' qsort really be that outstandingly incompetent? How many rows are actually being sorted here, anyway? It might be entertaining to snarf a qsort off the net (from glibc, perhaps) and link it into Postgres to see if you get better results. regards, tom lane
pgsql-general by date: