Re: [HACKERS] A Better External Sort? - Mailing list pgsql-performance

From Pailloncy Jean-Gerard
Subject Re: [HACKERS] A Better External Sort?
Date
Msg-id 5A8CC216-2E25-4B8C-910A-69224D0A66CE@rilk.com
Whole thread Raw
In response to Re: [HACKERS] A Better External Sort?  (Ron Peacetree <rjpeace@earthlink.net>)
Responses Re: [HACKERS] A Better External Sort?  (PFC <lists@boutiquenumerique.com>)
List pgsql-performance
>> Your main example seems to focus on a large table where a key
>> column has
>> constrained values.  This case is interesting in proportion to the
>> number of possible values.  If I have billions of rows, each
>> having one
>> of only two values, I can think of a trivial and very fast method of
>> returning the table "sorted" by that key: make two sequential passes,
>> returning the first value on the first pass and the second value
>> on the
>> second pass.  This will be faster than the method you propose.
>>
>>
> 1= No that was not my main example.  It was the simplest example
> used to
> frame the later more complicated examples.  Please don't get hung
> up on it.
>
> 2= You are incorrect.  Since IO is the most expensive operation we
> can do,
> any method that makes two passes through the data at top scanning
> speed
> will take at least 2x as long as any method that only takes one
> such pass.
You do not get the point.
As the time you get the sorted references to the tuples, you need to
fetch the tuples themself, check their visbility, etc. and returns
them to the client.

So,
if there is only 2 values in the column of big table that is larger
than available RAM,
two seq scans of the table without any sorting
is the fastest solution.

Cordialement,
Jean-Gérard Pailloncy


pgsql-performance by date:

Previous
From: PFC
Date:
Subject: Re: Comparative performance
Next
From: Joe
Date:
Subject: Re: Comparative performance