Thread: Re: External Sort timing debug statements
On Sun, 2005-10-02 at 19:43 +0100, Simon Riggs wrote: > The following patch implements a fairly light set of timing statements > aimed at understanding external sort performance. There is no attempt to > alter the algorithms. Minor update of patch, use this version please. > Best Regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes: >> The following patch implements a fairly light set of timing statements >> aimed at understanding external sort performance. There is no attempt to >> alter the algorithms. > Minor update of patch, use this version please. Applied with revisions: I made it use the VacRUsage code so that we could see both CPU and elapsed time, and moved the report points around a bit. The output with trace_sort enabled looks like this: NOTICE: begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = t NOTICE: switching to external sort: CPU 0.05s/0.10u sec elapsed 0.15 sec NOTICE: finished writing run 1: CPU 0.14s/0.83u sec elapsed 0.99 sec NOTICE: finished writing run 2: CPU 0.25s/1.67u sec elapsed 1.94 sec NOTICE: finished writing run 3: CPU 0.37s/2.51u sec elapsed 2.90 sec NOTICE: finished writing run 4: CPU 0.48s/3.36u sec elapsed 3.86 sec ... NOTICE: finished writing run 45: CPU 5.06s/38.26u sec elapsed 43.55 sec NOTICE: performsort starting: CPU 5.10s/38.62u sec elapsed 43.95 sec NOTICE: finished writing run 46: CPU 5.11s/38.84u sec elapsed 44.18 sec NOTICE: finished writing final run 47: CPU 5.11s/38.88u sec elapsed 44.22 sec NOTICE: finished merge step: CPU 5.12s/39.02u sec elapsed 44.37 sec NOTICE: finished merge step: CPU 5.13s/39.16u sec elapsed 44.53 sec ... NOTICE: finished merge step: CPU 6.57s/67.78u sec elapsed 74.83 sec NOTICE: performsort done: CPU 6.57s/67.78u sec elapsed 74.84 sec NOTICE: sort ended: CPU 10.80s/74.73u sec elapsed 86.21 sec regards, tom lane
On Mon, 2005-10-03 at 18:59 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > >> The following patch implements a fairly light set of timing statements > >> aimed at understanding external sort performance. There is no attempt to > >> alter the algorithms. > > > Minor update of patch, use this version please. > > Applied with revisions: I made it use the VacRUsage code so that we > could see both CPU and elapsed time, and moved the report points around > a bit. The output with trace_sort enabled looks like this: > > NOTICE: begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = t > NOTICE: switching to external sort: CPU 0.05s/0.10u sec elapsed 0.15 sec > NOTICE: finished writing run 1: CPU 0.14s/0.83u sec elapsed 0.99 sec > NOTICE: finished writing run 2: CPU 0.25s/1.67u sec elapsed 1.94 sec > NOTICE: finished writing run 3: CPU 0.37s/2.51u sec elapsed 2.90 sec > NOTICE: finished writing run 4: CPU 0.48s/3.36u sec elapsed 3.86 sec > ... > NOTICE: finished writing run 45: CPU 5.06s/38.26u sec elapsed 43.55 sec > NOTICE: performsort starting: CPU 5.10s/38.62u sec elapsed 43.95 sec > NOTICE: finished writing run 46: CPU 5.11s/38.84u sec elapsed 44.18 sec > NOTICE: finished writing final run 47: CPU 5.11s/38.88u sec elapsed 44.22 sec > NOTICE: finished merge step: CPU 5.12s/39.02u sec elapsed 44.37 sec > NOTICE: finished merge step: CPU 5.13s/39.16u sec elapsed 44.53 sec > ... > NOTICE: finished merge step: CPU 6.57s/67.78u sec elapsed 74.83 sec > NOTICE: performsort done: CPU 6.57s/67.78u sec elapsed 74.84 sec > NOTICE: sort ended: CPU 10.80s/74.73u sec elapsed 86.21 sec Many thanks, Best Regards, Simon Riggs
Tom Lane <tgl@sss.pgh.pa.us> writes: > Applied with revisions: I made it use the VacRUsage code so that we > could see both CPU and elapsed time, and moved the report points around > a bit. The output with trace_sort enabled looks like this: > > NOTICE: begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = t > NOTICE: switching to external sort: CPU 0.05s/0.10u sec elapsed 0.15 sec > NOTICE: finished writing run 1: CPU 0.14s/0.83u sec elapsed 0.99 sec > NOTICE: finished writing run 2: CPU 0.25s/1.67u sec elapsed 1.94 sec > NOTICE: finished writing run 3: CPU 0.37s/2.51u sec elapsed 2.90 sec > NOTICE: finished writing run 4: CPU 0.48s/3.36u sec elapsed 3.86 sec I'm not actually involved in this so maybe I'm completely off base here. But wouldn't you want to know how many tuples are being sorted and how many data are being written in these runs in order to be able to actually make sense of these timing measurements? Otherwise you would never know whether the results people will send you are really comparable. And even if you're testing things in a well controlled environments you'll never know if you missed something and you're comparing two incomparable results. -- greg
Greg Stark <gsstark@mit.edu> writes: > I'm not actually involved in this so maybe I'm completely off base here. But > wouldn't you want to know how many tuples are being sorted and how many data > are being written in these runs in order to be able to actually make sense of > these timing measurements? In all plausible test cases you should be able to know the total number of tuples being sorted independently. Counting the tuples processed in each run would require extra per-tuple overhead, which I'd rather avoid until proven necessary. The total-data-volume aspect may or may not be interesting, not sure yet. Let's see what we can learn from the present low-impact patch. regards, tom lane