Thread: Everlasting SQL query
Hi everyone, I have a customer table (17518 records) and an orders table (88393 records). One of the columns of orders is customerid,containing the customerid (what else, but it is not a foreign key as this table is imported from a database thatdid not support foreign keys). If I do this query (with pgadmin III): select customer.id, customer.name, orders.id from customers, orders order by customer.id, orders.id limit 25 The query runs forever (the longest I let it run is 500 seconds). Explain gives me this (why 7 rows?): QUERY PLAN ------------------------------------------------------------------------------------- Limit (cost=722506879.16..722506879.22 rows=25 width=44) -> Sort (cost=722506879.16..726378050.59 rows=1548468574 width=44) Sort Key: klt_alg.klantnummer, orders.ordernummer -> Nested Loop (cost=9408.93..36288661.59 rows=1548468574 width=44) -> Seq Scan on klt_alg (cost=0.00..1927.18 rows=17518 width=40) -> Materialize (cost=9408.93..10595.86 rows=88393 width=4) -> Seq Scan on orders (cost=0.00..9105.93 rows=88393 width=4) (7 rows) If I only order by customer.id or by orders.is the query return within a second. Can anyone give me a reason why this all happens? Groeten, Joost Kraaijeveld Askesis B.V. Molukkenstraat 14 6524NB Nijmegen tel: 024-3888063 / 06-51855277 fax: 024-3608416 e-mail: J.Kraaijeveld@Askesis.nl web: www.askesis.nl
Joost, The spelling mistake from below will cause you an unwanted join with the "customers" table if you have one, which is unconstrained by any where clause, so it will take a carthesian product of the rest and the customers table. If this spelling mistake is just in the mail, please disregard. Cheers, Csaba. > select customer.id, customer.name, orders.id > from customers, orders ^^^^^^^^^ > order by customer.id, orders.id > limit 25 >
Hi Joost. Joost Kraaijeveld wrote: > I have a customer table (17518 records) and an orders table (88393 > records). One of the columns of orders is customerid, containing the > customerid (what else, but it is not a foreign key as this table is > imported from a database that did not support foreign keys). > > If I do this query (with pgadmin III): > > select customer.id, customer.name, orders.id from customers, orders > order by customer.id, orders.id limit 25 > > The query runs forever (the longest I let it run is 500 seconds). No wonder. You are retrieving 1548468574 rows. You are trying to perform a JOIN, but without specifying which fields to join on. So the query works with cartesian product of these two table (all possible combinantions), which is 17518 * 88393 = 1548468574 rows. You want: select customer.id, customer.name, orders.id from customers JOIN orders ON customers.id=orders.customerid order by customer.id, orders.id limit 25 or alternatively: select customer.id, customer.name, orders.id from customers, orders where customers.id=orders.customerid order by customer.id, orders.id limit 25 I prefer the first notation, though. -- Michal Taborsky http://www.taborsky.cz
On Wed, 2004-07-28 at 12:08, Joost Kraaijeveld wrote: > Hi everyone, > > I have a customer table (17518 records) and an orders table (88393 records). One of the columns of orders is customerid,containing the customerid (what else, but it is not a foreign key as this table is imported from a database thatdid not support foreign keys). > > If I do this query (with pgadmin III): > > select customer.id, customer.name, orders.id > from customers, orders > order by customer.id, orders.id > limit 25 > > The query runs forever (the longest I let it run is 500 seconds). > [...] > If I only order by customer.id or by orders.is the query return within a second. > > Can anyone give me a reason why this all happens? This is an inner join without a where clause. It gives the crsoo product of 17518*88393 = 1548468574 results. If you order by just one, PG can manage to give you the first 25 results out of the 1.5 billion (!). If you order by both there's no other way than to (try) computing everything - which PG does. You most likely don't want this. Add a where clause: where order.customer_id = customer.id or something like that (I'm just guessing your scheme). Bye, Chris.
On Jul 28, 2004, at 7:08 PM, Joost Kraaijeveld wrote: > select customer.id, customer.name, orders.id > from customers, orders > order by customer.id, orders.id > limit 25 > > The query runs forever (the longest I let it run is 500 seconds). You have no join condition, so it's doing a full cartesian join (17518 x 88393 = 1,548,468,574 rows before the limit). Try this: select customer.id, customer.name, orders.id from customers c, orders o where c.id = o.customerid order by customer.id, orders.id limit 25 Michael Glaesemann grzm myrealbox com