Thread: How to keep Postgres from thrashing for a long time using "LIMIT" (or is there a better way?)

We've got a rather large table and a situation where our users might do a
query that essentially says "give me everything", or returns 20,000 rows...
We want to limit this so that a reasonable number of hits are returned in a
reasonable length of time.

We're using "LIMIT" to do this, but it appears that Postgres is going out &
retrieving everything first and THEN applying the limit. (An EXPLAIN of the
query with & without LIMIT bears this theory out...)

So my question is, are we using LIMIT in the wrong way, or is there a better
way to achieve our purpose?

We're using JDBC to submit the query via Java, and the query looks like
this:

select court_config.court_location_text,
       actor_case_assignment.case_id,
       actor_person_date_of_birth,
       assigned_case_role,
       actor_case_assignment.court_ori,
       actor.actor_full_name,
       actor_case_assignment.actor_id,
       case_data.local_type_code,
       case_data.local_subtype_code,
       actor_case_assignment.impound_litigant_data,
       actor.actor_alias_for_actor_id
from court_config, actor_case_assignment, actor, case_data
where (court_config.court_ori like 'IL'
       or court_config.court_address_state like 'IL')
  and court_config.court_ori = actor_case_assignment.court_ori
  and (actor.actor_id = actor_case_assignment.actor_id
       or actor.actor_alias_for_actor_id = actor_case_assignment.actor_id)
  and court_config.court_ori = actor_case_assignment.court_ori
  and case_data.case_id = actor_case_assignment.case_id
order by case_id
limit 200,2000;

TIA for any help! -I'll try to return the favor some day!

-Nick
---------------------------------------------------------------------
Nick Fankhauser

Business:
nickf@ontko.com  Phone 1.765.935.4283  Fax 1.765.962.9788
Ray Ontko & Co.  Software Consulting Services  http://www.ontko.com/

Personal:
nickf@fankhausers.com   http://www.fankhausers.com


"Nick Fankhauser" <nickf@ontko.com> writes:
> We're using "LIMIT" to do this, but it appears that Postgres is going out &
> retrieving everything first and THEN applying the limit. (An EXPLAIN of the
> query with & without LIMIT bears this theory out...)

> select ...
> order by case_id  limit 200,2000;

Do you have indexes on the case_id fields?  If so, try specifying the
ORDER BY as an ORDER BY one or the other input case_id fields (ie,
qualify the name), rather than the output case_id field.
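As a sketch (untested against your schema, and assuming the case_id you
want to sort on is the one in actor_case_assignment and that it has, or
can be given, a btree index):

```sql
-- hypothetical index name; point it at whichever table's case_id
-- you actually want the output ordered by
create index aca_case_id_idx on actor_case_assignment (case_id);

select ...                                   -- same target list as before
from court_config, actor_case_assignment, actor, case_data
where ...                                    -- same conditions as before
order by actor_case_assignment.case_id       -- qualified input column,
limit 200,2000;                              -- not the output case_id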

The only way to produce sorted output in the general case is to form
the whole query result, sort it, then return just the requested rows.
As a moment's thought will show, it's impossible for a sort step to
produce any output until it's examined all its input.

For a query that orders by a table column that has an index, a possible
query plan is to scan the table via the index until 2200 rows have been
produced.  For a sufficiently small limit, I'd expect that plan to be
chosen.  Hard to tell how small is sufficiently small, however,
especially given the complexity of the query.

You could try experimenting with SET ENABLE_SORT = OFF to discourage
the planner from using an explicit sort, too.  Not clear whether you'd
get a better plan that way or not, but it's worth trying.
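For instance, in a psql session (turning the setting back on afterwards
so it doesn't affect your other queries):

```sql
set enable_sort = off;     -- discourage explicit sort steps
explain
select ...                 -- the query as before
order by case_id limit 200,2000;
set enable_sort = on;      -- restore the default
```

Note this only nudges the planner's cost estimates; it doesn't forbid a
sort outright, so compare the EXPLAIN output before and after.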

I'd be interested to hear your results --- both the EXPLAIN output for
different possibilities, and the actual timings.

BTW, I hope you are using 7.0.something.  6.5 and before didn't have
any ability to adjust the plan depending on presence of a LIMIT.

            regards, tom lane