Thread: How to keep Postgres from thrashing for a long time using "LIMIT" (or is there a better way?)!
From: "Nick Fankhauser"
We've got a rather large table and a situation where our users might do a
query that essentially says "give me everything" -- or at least 20,000 rows.
We want to limit this so that a reasonable number of hits are returned in a
reasonable length of time. We're using "LIMIT" to do this, but it appears
that Postgres is going out and retrieving everything first and THEN applying
the limit. (An EXPLAIN of the query with and without LIMIT bears this theory
out...)

So my question is: are we using LIMIT in the wrong way, or is there a better
way to achieve our purpose? We're submitting the query from Java via JDBC,
and the query looks like this:

    select court_config.court_location_text,
           actor_case_assignment.case_id,
           actor_person_date_of_birth,
           assigned_case_role,
           actor_case_assignment.court_ori,
           actor.actor_full_name,
           actor_case_assignment.actor_id,
           case_data.local_type_code,
           case_data.local_subtype_code,
           actor_case_assignment.impound_litigant_data,
           actor.actor_alias_for_actor_id
    from court_config, actor_case_assignment, actor, case_data
    where (court_config.court_ori like 'IL'
           or court_config.court_address_state like 'IL')
      and court_config.court_ori = actor_case_assignment.court_ori
      and (actor.actor_id = actor_case_assignment.actor_id
           or actor.actor_alias_for_actor_id = actor_case_assignment.actor_id)
      and court_config.court_ori = actor_case_assignment.court_ori
      and case_data.case_id = actor_case_assignment.case_id
    order by case_id
    limit 200,2000;

TIA for any help! I'll try to return the favor some day!

-Nick

---------------------------------------------------------------------
Nick Fankhauser
Business: nickf@ontko.com   Phone 1.765.935.4283   Fax 1.765.962.9788
Ray Ontko & Co.  Software Consulting Services  http://www.ontko.com/
Personal: nickf@fankhausers.com  http://www.fankhausers.com
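[A minimal way to check the "retrieves everything first" theory is to compare plans with EXPLAIN. This is an illustrative sketch, not from the original post; it uses a cut-down query against the thread's `case_data` table rather than the full four-way join:]

```sql
-- Sketch: compare the planner's choices with and without LIMIT.
-- A Sort node feeding the top of the plan means the whole result set
-- is materialized and sorted before any rows come back.
EXPLAIN
SELECT case_id FROM case_data ORDER BY case_id;

EXPLAIN
SELECT case_id FROM case_data ORDER BY case_id LIMIT 2000;
```

If both plans show the same full sort, the LIMIT is only trimming the output after the expensive work is done.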
Re: How to keep Postgres from thrashing for a long time using "LIMIT" (or is there a better way?)!
From: Tom Lane
"Nick Fankhauser" <nickf@ontko.com> writes:
> We're using "LIMIT" to do this, but it appears that Postgres is going out &
> retrieving everything first and THEN applying the limit. (An EXPLAIN of the
> query with & without LIMIT bears this theory out...)

> select ...
> order by case_id limit 200,2000;

Do you have indexes on the case_id fields? If so, try writing the ORDER BY
against one or the other of the *input* case_id fields (ie, qualify the
name), rather than the output case_id column.

The only way to produce sorted output in the general case is to form the
whole query result, sort it, and then return just the requested rows. As a
moment's thought will show, a sort step cannot produce any output until it
has examined all of its input.

For a query that orders by a table column that has an index, however, a
possible query plan is to scan the table via the index until 2200 rows have
been produced. For a sufficiently small limit I'd expect that plan to be
chosen. Hard to tell how small is sufficiently small, though, especially
given the complexity of the query.

You could also try experimenting with SET ENABLE_SORT = OFF to discourage
the planner from using an explicit sort. It's not clear whether you'd get a
better plan that way or not, but it's worth trying. I'd be interested to
hear your results --- both the EXPLAIN output for the different
possibilities, and the actual timings.

BTW, I hope you are using 7.0.something. 6.5 and before didn't have any
ability to adjust the plan depending on the presence of a LIMIT.

			regards, tom lane
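[The two suggestions above can be sketched as follows. This is an illustrative fragment, not part of the original reply; the index name is hypothetical, and the elided select list and WHERE clause stand in for the ones in Nick's query:]

```sql
-- 1. Make sure an index exists on the input column being sorted, and
--    qualify the ORDER BY so it names that table column directly.
--    (Index name is hypothetical.)
CREATE INDEX actor_case_assignment_case_id_idx
    ON actor_case_assignment (case_id);

SELECT ...                               -- same select list as before
FROM court_config, actor_case_assignment, actor, case_data
WHERE ...                                -- same join conditions as before
ORDER BY actor_case_assignment.case_id   -- qualified: an input column
LIMIT 2000;

-- 2. Discourage explicit sorts for this session, then re-check the plan.
SET ENABLE_SORT = OFF;
EXPLAIN SELECT ...;                      -- same query as above
```

With the ORDER BY naming an indexed input column, the planner has the option of walking the index and stopping once enough rows have been produced, instead of sorting the full join result.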