Re: [HACKERS] \dt and disk access - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [HACKERS] \dt and disk access
Date
Msg-id b09077595e18b016bacfeaf032bd6c84
Whole thread Raw
In response to [HACKERS] \dt and disk access  (Bruce Momjian <maillist@candle.pha.pa.us>)
List pgsql-hackers
>
> Bruce Momjian wrote:
> > It looks like psort()'s use of disk files to do the sorting is causing
> > the slowdown, with the multiple create's and unlink's.  I am going to
> > modify psort() to use memory buffers instead of disk files for sorting
> > small SELECT results.  psort() is only called by ExecSort, which is used
> > by ORDER BY's.
> >
> > Permission to put this in 6.1p1?
>
> IMHO sounds like a *bad* idea, since it might take a while to debug.
> Better to leave this as a feature for now (Postgres has had this
> forever?) and come up with something great for v6.2; I'd be interested
> in contributing to this if it would be helpful.

I think I agree.  Let's leave it out.  I initially thought it was a
FASTBULD btree issue, but I found it was unrelated.

>
> I don't know which Knuth algorithm the code uses, but some sort
> algorithms perform very poorly on data which happens to be already
> sorted. QuickSort, which comes on many Unix boxes (the qsort() call) is
> quickest on completely unordered/disordered data and its slowest
> performance is on already perfectly sorted data (I'm doing this from
> memory, but the facts may be mostly correct :).

It is from Knuth III, page 280.  There are references to Knuth pages and
diagrams all over the code.

The sort is meant for cases when you don't have enough memory to do the
sort, and the thrashing of doing all those random accesses can be slow.

Now why they would not implement a fast version for cases where the rows
WOULD fit in memory, I don't know.

I will wait for 6.1p1, and then apply the fix.  I plan to just use
qsort() on an array of tuple pointers if the size of the ORDER BY result
is under 256k bytes, which is only 1/2 of the default shared buffer
size.  I will run tests to see if there is major speed improvement at
that size.  If not, I will knock it down to 128k or 64k, just to be safe
that it will not swamp the shared pool on a busy system.  I may even key
the trigger value on the number of shared buffers allocated.  This has
got to be faster than what it does now.

The psort() code is designed for sorts that would clearly fill the
shared buffer pool and disk buffer pool.  It takes a multi-megabyte
ORDER BY to do that on my machine.

And I am using a Barracuda Ultra drive with tagged queueing doing 7
Mbytes/second transfers, so I am not testing this on a slow drive when I
report 0.23 seconds to sort one row.

- --
Bruce Momjian
maillist@candle.pha.pa.us

------------------------------

pgsql-hackers by date:

Previous
From: "Thomas G. Lockhart"
Date:
Subject: Re: [HACKERS] \dt and disk access
Next
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] \dt and disk access\