Thread: OFFSET and LIMIT - performance

OFFSET and LIMIT - performance

From
"Jan Bilek"
Date:
Hello,
I'm using PGDB with JDBC. In my app I need to select only a portion of all available rows. I know I can do it two ways:
1. I can use OFFSET and LIMIT SQL clauses, or
2. I can select all rows and then filter the requested portion in Java.

My question - does the second way significantly affect performance, especially when used with JDBC?
Does the LIMIT clause do anything other than browsing/filtering rows with a cursor (i.e. the same thing I'd do in Java)?
How are the selected rows processed in SELECT queries? Does PG select all rows and then filter them with LIMIT and OFFSET, or does it apply them while processing the query?

Thanks for your analysis!
 
JB

Re: OFFSET and LIMIT - performance

From
Tom Lane
Date:
"Jan Bilek" <bilekj@gmail.com> writes:
> I'm using PGDB with JDBC. In my app I need to select only a portion of all
> available rows. I know I can do it two ways:
> 1. I can use OFFSET and LIMIT SQL clauses, or
> 2. I can select all rows and then filter the requested portion in Java.

> My question - does the second way significantly affect performance,
> especially when used with JDBC?

Network transmission costs alone would make the second way a loser.

Large OFFSETs are pretty inefficient because the backend generates and
discards the rows internally ... but at least it never converts them to
external form or ships them to the client.  Rows beyond the LIMIT are
not generated at all.
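To make the two ways concrete, here is a small sketch using Python's built-in sqlite3 module purely to stay self-contained (stand-in table and data; the SQL shape is the same under PostgreSQL/JDBC):

```python
import sqlite3

# Stand-in table and data; sqlite3 only keeps the sketch self-contained --
# the SQL shape is the same under PostgreSQL/JDBC.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO items (payload) VALUES (?)",
                 [("row-%d" % i,) for i in range(1000)])

# Way 1: the database does the paging -- only 10 rows reach the client.
page = conn.execute(
    "SELECT id, payload FROM items ORDER BY id LIMIT 10 OFFSET 500"
).fetchall()

# Way 2: fetch everything and slice client-side -- all 1000 rows are
# transferred first, which is what makes it a loser over a network.
everything = conn.execute(
    "SELECT id, payload FROM items ORDER BY id"
).fetchall()
same_page = everything[500:510]

assert page == same_page and len(page) == 10 and page[0][0] == 501
```

Both ways yield the same page; the difference is how many rows cross the wire before the client sees it.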

            regards, tom lane

Re: OFFSET and LIMIT - performance

From
David Wall
Date:
> Network transmission costs alone would make the second way a loser.
>
> Large OFFSETs are pretty inefficient because the backend generates and
> discards the rows internally ... but at least it never converts them to
> external form or ships them to the client.  Rows beyond the LIMIT are
> not generated at all.
>
Some of this would depend on the query, too, I suspect, since an ORDER
BY would require the entire result set to be determined and sorted before
the limit/offset could take place.  Regardless, it's better than
filtering on the Java/client side, since it avoids sending everything
from the database backend to the client.

But how would that compare to using a cursor/fetch query?  It seems like
the JDBC library will automatically use a cursor if you specify some
params on the PreparedStatement, though the details escape me.  I think
it's related to setFetchSize() and/or setMaxRows().  Of course, those
are not guaranteed to do anything special either, and you'd still need
to retrieve and discard initial rows unless you can adjust your WHERE
condition to find the "next set".

If you have an ORDER BY on a unique field, for example, you could use
that field to query the next set by remembering the last value in your
previous result set (or selecting one more row than you need so you have
the exact value that would be next) and specifying it in the WHERE clause.
Even this could be an issue if concurrent updates change the ordering.
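That "remember the last value" idea (keyset pagination) can be sketched like so, again with sqlite3 just to keep the sketch runnable; the table and column names are made up, and the WHERE/ORDER BY/LIMIT shape is the same in PostgreSQL:

```python
import sqlite3

# Made-up table; sqlite3 only keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, note TEXT)")
conn.executemany("INSERT INTO events (note) VALUES (?)",
                 [("e%d" % i,) for i in range(100)])

def next_page(conn, last_id, page_size=20):
    # Seek past the last row already seen instead of using OFFSET, so the
    # backend never has to generate-and-discard the earlier rows.
    return conn.execute(
        "SELECT id, note FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size)).fetchall()

page1 = next_page(conn, 0)             # first page: ids 1..20
page2 = next_page(conn, page1[-1][0])  # remember the last value seen
assert page1[-1][0] == 20 and page2[0][0] == 21
```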

LIMIT/OFFSET are not part of the SQL standard either, should that matter
for DB portability.  I believe MySQL supports them, but it seems like
Oracle didn't (at least at one time).

David


Re: OFFSET and LIMIT - performance

From
"Simon Riggs"
Date:
On Thu, 2007-06-28 at 11:36 -0700, David Wall wrote:
> > Network transmission costs alone would make the second way a loser.
> >
> > Large OFFSETs are pretty inefficient because the backend generates and
> > discards the rows internally ... but at least it never converts them to
> > external form or ships them to the client.  Rows beyond the LIMIT are
> > not generated at all.
> >
> Some of this would depend on the query, too, I suspect, since an ORDER
> BY would require the entire result set to be determined, sorted and then
> the limit/offset could take place.

In 8.3 a LIMIT clause will be evaluated at the same time as ORDER BY, so
that the full sort cost is avoided. This means that queries with LIMIT
are more likely to return in constant time, whether you have no ORDER
BY, an ORDER BY on an index, or an ORDER BY with no index. So indexes
specifically to provide a fast ORDER BY/LIMIT are no longer required.
Courtesy of Greg Stark.
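As a rough analogy (not the actual executor code), the top-N sort behaves like keeping a bounded heap of the best LIMIT rows seen so far, rather than materializing a full sorted result:

```python
import heapq
import random

rng = random.Random(42)
rows = [rng.random() for _ in range(10000)]   # stand-in for unsorted rows

# Full sort, then take the first 5 -- the shape of the pre-8.3 plan for
# ORDER BY + LIMIT with no usable index.
top5_full_sort = sorted(rows)[:5]

# Bounded heap of the 5 smallest seen so far -- the shape of the 8.3
# top-N sort, which never sorts the whole input.
top5_heap = heapq.nsmallest(5, rows)

assert top5_heap == top5_full_sort
```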

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: OFFSET and LIMIT - performance

From
Kaloyan Iliev
Date:
And what about using cursors and MOVE?  Which is faster - OFFSET/LIMIT or
CURSOR/MOVE?

Best Regards,

   Kaloyan Iliev

Tom Lane wrote:

> "Jan Bilek" <bilekj@gmail.com> writes:
>> I'm using PGDB with JDBC. In my app I need to select only a portion of all
>> available rows. I know I can do it two ways:
>> 1. I can use OFFSET and LIMIT SQL clauses, or
>> 2. I can select all rows and then filter the requested portion in Java.
>
>> My question - does the second way significantly affect performance,
>> especially when used with JDBC?
>
> Network transmission costs alone would make the second way a loser.
>
> Large OFFSETs are pretty inefficient because the backend generates and
> discards the rows internally ... but at least it never converts them to
> external form or ships them to the client.  Rows beyond the LIMIT are
> not generated at all.
>
>            regards, tom lane