On Wed, 2006-11-22 at 11:17 +0100, Markus Schaber wrote:
> PostgreSQL 8.1 (and, back then, 7.4) have the tendency to underestimate
> the costs of sort operations, compared to index scans.
>
> The Backend allocates gigs of memory (we've set sort_mem to 1 gig), and
> then starts spilling out more Gigs of temporary data to the disk. So the
> execution gets - in the end - much slower compared to an index scan, and
> wastes lots of disk space.
>
> We did not manage to tune the config values appropriately, at least not
> without causing other query plans to suffer badly.
8.2 has substantial changes to sort code, so you may want to give the
beta version a try to check for how much better it works. That's another
way of saying that sort in 8.1 and before has some performance problems
when you are sorting more than 6 * 2 * work_mem (on randomly sorted
data) and the cost model doesn't get this right, as you observe.
Try enabling trace_sort (available in both 8.1 and 8.2) and post the
results here please, which would be very useful to have results on such
a large real-world sort.
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com