Re: Sorting Improvements for 8.4 - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Sorting Improvements for 8.4
Date
Msg-id 1198088313.28804.387.camel@dogma.ljc.laika.com
Whole thread Raw
In response to Re: Sorting Improvements for 8.4  (Mark Mielke <mark@mark.mielke.cc>)
Responses Re: Sorting Improvements for 8.4  (Mark Mielke <mark@mark.mielke.cc>)
List pgsql-hackers
On Wed, 2007-12-19 at 12:08 -0500, Mark Mielke wrote:
> Stupid question #2: Is it well recognized that the CPU is the
> bottleneck in the PostgreSQL sorting mechanism? Or might it be memory
> bandwidth and I/O?
> 

I think it depends a lot on several factors. It's probably a different
bottleneck for integers versus localized text, and depends on the
available memory and I/O characteristics.

> 
> It would seem to me that any sort worth parallelizing (administrative
> and synchronization overhead), must have data larger than the L2
> cache. If larger than the L2 cache, it becomes real memory speed. If
> real memory speed, wouldn't one CPU without hardware synchronization,
> be able to fill the memory read/write pipe? 

We do an external merge sort, which involves merging M runs together.
You seem to be implying that we can generate the output run at disk
speed, and therefore the CPU speed is unimportant.

I suspect that the comparison costs are enough that the above statement
isn't true in all cases, particularly in the case of localized text. 

Also, there is probably a lot of memory copying going on, and that
probably destroys a lot of the effectiveness of L2 caching. When L2
caching is ineffective, the CPU spends a lot of time just waiting on
memory. In that case, it's better to have P threads of execution all
waiting on memory operations in parallel.

This would explain why "1p2t" would outperform a "1p1t" in Ron's
reference above.

These are just my first thoughts, however. There is a lot of existing
research out there that we can look into, and also a lot of tests that
we can run before jumping into this.

I think parallel sorting can be looked into separately from the other
sorting improvements.

Regards,Jeff Davis





pgsql-hackers by date:

Previous
From: "Gokulakannan Somasundaram"
Date:
Subject: Re: Proposal for Null Bitmap Optimization(for TrailingNULLs)
Next
From: Ron Mayer
Date:
Subject: Re: Sorting Improvements for 8.4