Re: Sorting Improvements for 8.4 - Mailing list pgsql-hackers

From Mark Mielke
Subject Re: Sorting Improvements for 8.4
Date
Msg-id 476982F3.8040802@mark.mielke.cc
Whole thread Raw
In response to Re: Sorting Improvements for 8.4  (Ron Mayer <rm_pg@cheapcomplexdevices.com>)
List pgsql-hackers
Ron Mayer wrote: <blockquote cite="mid:47696249.4090602@cheapcomplexdevices.com" type="cite"><pre wrap="">The link in
thebeginning of the thread points to articles
 
that seem to describe one such algorithm; along with benchmarks.
(<a class="moz-txt-link-freetext" href="http://tinyurl.com/3bvu4u">http://tinyurl.com/3bvu4u</a>, <a
class="moz-txt-link-freetext"href="http://tinyurl.com/32wg2m">http://tinyurl.com/32wg2m</a>)
 
The improvements were pretty consistent from set sizes ranging
from very small sets (hundreds) to quite large ones (hundreds of K).

Interestingly, even multi-threading helped a lot.
  "Our tests correlate well with previous research that showed   Intel’s implementation of SMT (Hyper-Threading) to be
adept at hiding this latency [6, 20, 12].Table 4 shows that by   having two threads access memory at the same time,
performance  improved over 80% when compared to the singlethreaded version.
 

It uses both quicksort phases and merge phases; for the merge phase
using 2CPUs (no hyperthreading) apparently gave more than 2X speed
improvement; apparently because it could parallelize memory access
with CPU more. </pre></blockquote> Good points. I had forgotten about DDR and DDR2 having high throughput at the cost
ofhigh latency. Somewhere in there, having the most number of memory requests in the queue would allow hardware to
eliminatethis high latency effect.<br /><br /><blockquote cite="mid:47696249.4090602@cheapcomplexdevices.com"
type="cite"><blockquotetype="cite"><pre wrap="">Or do you mean being able to perform parts of the query plan fully in
 
parallel? If this, then one would need a lot more than ParallelSort...   </pre></blockquote><pre wrap="">I wouldn't
recommendthat - it seems like a Hard Problem.
 

My guess is that the best way to use multiple threads in one backend
would be to find specific algorithms like sorting that  would be
easier to isolate. </pre></blockquote> Also a good point. :-)<br /><br /> Cheers,<br /> mark<br /><br /><pre
class="moz-signature"cols="72">-- 
 
Mark Mielke <a class="moz-txt-link-rfc2396E" href="mailto:mark@mielke.cc"><mark@mielke.cc></a>
</pre>

pgsql-hackers by date:

Previous
From: "Joshua D. Drake"
Date:
Subject: Re: Proposal for Null Bitmap Optimization(for TrailingNULLs)
Next
From: Mark Mielke
Date:
Subject: Re: Sorting Improvements for 8.4