On Tue, 2005-04-12 at 10:04 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > Could anybody comment on whether the current tests appropriately cover
> > the correctness of the external sorting algorithms?
>
> It's highly unlikely that the regression tests stress external sorts
> much, or that anyone would hold still for making them run long enough
> to do so ;-)
OK
> It's not hard to create a stress test: just load a bunch of random
> numbers into a table and create a b-tree index on it. To check the
> correctness of the sort, you could CLUSTER on the index and then read
> out the table to see if it were now in sorted order.
Just checking. No point starting anything until a test is in place. Yes,
such a test is fairly straightforward to write - I just didn't want to do it...
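
For what it's worth, the checking half of such a test could be sketched
roughly as below. This is only an illustration in Python, not anything in
the tree: the table and index names in the SQL comment (sort_test,
sort_test_idx) and the helper names are made up, and the "read back" rows
are simulated with sorted() rather than fetched from a real backend.

```python
import random
from collections import Counter

# Hypothetical SQL for the load-and-sort step (all names are assumptions):
#   CREATE TABLE sort_test AS
#       SELECT (random() * 1000000)::int AS v
#       FROM generate_series(1, 1000000);
#   CREATE INDEX sort_test_idx ON sort_test (v);
#   -- then CLUSTER the table on sort_test_idx (syntax varies by version)
#   SELECT v FROM sort_test;   -- read the table back in physical order


def is_sorted(values):
    """Return True if values are in non-decreasing order."""
    prev = None
    first = True
    for v in values:
        if not first and v < prev:
            return False
        prev = v
        first = False
    return True


def same_multiset(before, after):
    """Return True if both sequences hold the same values, duplicates included."""
    return Counter(before) == Counter(after)


def check_sort(before, after):
    """A sort is correct if the output is ordered and no rows were lost or added."""
    return is_sorted(after) and same_multiset(before, after)


if __name__ == "__main__":
    rng = random.Random(42)
    data = [rng.randrange(1_000_000) for _ in range(100_000)]
    readback = sorted(data)  # stand-in for the rows read back after CLUSTER
    print("sort OK" if check_sort(data, readback) else "sort BROKEN")
```

Checking the multiset as well as the ordering is a deliberate choice here: a
sort that drops or duplicates tuples can still produce ordered output.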
> BTW, as for your original question about performance, the current
> external sort algorithm is mainly designed to conserve disk space,
> not to be as fast as possible. It could probably be a good bit faster
> if we didn't mind taking twice as much space (mainly because the
> physical disk access pattern would be a lot less random). But I know
> we will get push-back if we try to revert to doing that.
That's roughly what I'm looking into now: just scoping for the time
being. Anything submitted would keep the current behaviour as the default
and offer other functionality only as an option.
There's also some research into improved replacement selection
algorithms that may soon be submitted/submittable.
Best Regards,
Simon Riggs