Re: LIMIT/SORT optimization - Mailing list pgsql-patches

From Simon Riggs
Subject Re: LIMIT/SORT optimization
Date
Msg-id 1175984587.3623.594.camel@silverbirch.site
Whole thread Raw
In response to Re: LIMIT/SORT optimization  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
On Sat, 2007-04-07 at 14:11 -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > I did some performance testing of the patch, and the results were good.
> > I did this:
>
> >     test=> CREATE TABLE test (x INTEGER);
> >     test=> INSERT INTO test SELECT * FROM generate_series(1, 1000000);
> >     test=> SET log_min_duration_statement = 0;
> >     test=> SELECT * FROM test ORDER BY x LIMIT 3;
>
> LIMIT 3 seems an awfully favorable case; if the patch can only manage a
> factor of 4 speedup there, what happens at limit 10, 20, 100?  Also,
> you've tested only one sort size and only one (unspecified) value of
> work_mem, and the usefulness of the patch would surely vary depending on
> that.  In particular, what happens with a LIMIT large enough to overflow
> work_mem?

Yeh, this is really designed to improve the case where we retrieve a
"screenfull" of data. i.e. 25, 50 or 100 records. Or worst case 10
screenfulls.

The code deliberately doesn't use an insertion sort for that reason,
since that is beyond the cut-off where that works best. So it should be
optimised for medium numbers of rows when no index is present.

The use case is important because we want to be able to populate data
for screens in a reasonably bounded time, not one that gets suddenly
worse should the number of possible matches exceed work_mem. [Think how
well Google reacts to varying numbers of candidate matches] Whatever
happens with LIMIT > work_mem doesn't fit the use case, so as long as it
is no slower than what we have now, that should be fine.

> Lastly, I suspect that sorting presorted input might be particularly
> favorable for this patch.  Please try it with random data for comparison.

Agreed.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



pgsql-patches by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Heap page diagnostic/test functions (v2)
Next
From: Bruce Momjian
Date:
Subject: Re: simply custom variables protection