Thread: a LEFT JOIN problem
Hi, I have the following tables: Product(id, title, price) Item(id, product_id, order_id, quantity) Order(id, user_id, amount, paid) User(id, name) What I want to achieve is a query on a specific Product based in its title. If the product has at least 1 order in which it appears, then return the Product and Order details, if the product has no paid order associated, then only return the Product fields. I have tried the following query: -- SELECT products.*, paid FROM "products" LEFT OUTER JOIN items ON products.id = items.product_id LEFT OUTER JOIN orders ON items.order_id = orders.id LEFT OUTER JOIN users ON orders.user_id = users.id WHERE (title = E'some-product' AND paid = 1 AND name = 'thomas') -- The problem with my query, is that if there are no paid associated orders, then the WHERE will drop every returned line that has paid = 0, therefore I don't get anything at all, but I would like to have at least the products field. Moreover, the "name" argument in the WHERE comes from the user logged in data. So if the user is not logged in, no fields are returned. Is that possible to achieve in only one query? What is the correct way of doing it?
On Sat, Oct 25, 2008 at 2:11 PM, Thomas <iamkenzo@gmail.com> wrote: > Hi, > > I have the following tables: > > Product(id, title, price) > Item(id, product_id, order_id, quantity) > Order(id, user_id, amount, paid) > User(id, name) > > What I want to achieve is a query on a specific Product based in its > title. If the product has at least 1 order in which it appears, then > return the Product and Order details, if the product has no paid order > associated, then only return the Product fields. > > I have tried the following query: > -- > SELECT products.*, paid FROM "products" > LEFT OUTER JOIN items ON products.id = items.product_id > LEFT OUTER JOIN orders ON items.order_id = orders.id > LEFT OUTER JOIN users ON orders.user_id = users.id > WHERE (title = E'some-product' AND paid = 1 AND name = 'thomas') > -- > > The problem with my query, is that if there are no paid associated > orders, then the WHERE will drop every returned line that has paid = > 0, therefore I don't get anything at all, but I would like to have at > least the products field. Put the filtering in your ON clauses. The WHERE clause is processed after the JOINs are done, and is eliminating your results. Without knowing your schema, this is just a guess.. SELECT products.*, paid FROM "products" LEFT OUTER JOIN items ON (products.id = items.product_id AND title = 'some-product') LEFT OUTER JOIN orders ON (items.order_id = orders.id AND paid = 1) LEFT OUTER JOIN users ON (orders.user_id = users.id AND name = 'thomas') > Moreover, the "name" argument in the WHERE comes from the user logged > in data. So if the user is not logged in, no fields are returned. I suspect your app will need to to run the query and compare that against the user(s) currently logged in. Hope this helps! Tony Wasson
Hi Tony, You nailed it! That's the piece of the puzzle I was missing. Moving the filtering inside the ON clause was the solution.
Unfortunately, I cannot use >= in the ON clause when making a search on a date range.
On Mon, Oct 27, 2008 at 10:47:13AM +0100, Thomas wrote: > Unfortunately, I cannot use >= in the ON clause when making a search > on a date range. The right-hand-side of an ON clause is a general expression; you can include (and I regularly do) arbitrary functions and operators. I'm probably missing something obvious; but if you included a little more detail in your message and preferably some context you'd get a more useful response. Sam
For some reason, I now can include the date range search in my ON (...) clause. However I would like to know if there is a limit to the number of conditions I can put. It seems that more than 2 conditions misses some records.
Here is the SQL I am working with: -- SELECT products.*, orders.response_code FROM "products" JOIN items ON products.id = items.product_id LEFT OUTER JOIN orders ON (items.order_id = orders.id AND orders.response_code = '0' AND orders.user_id = 2) WHERE (permalink = E'product-1' AND products.site_id = 1) LIMIT 1 -- An order has been paid when its response_code is 0. One problem I have is that, if the user has not placed an order for the product, than I still want the product fields to be returned so I can't put this condition in a WHERE. If a user has paid an order for that product, I want all fields from Product to be returned and I want also the response_code to be returned (it will obviously be 0). But in the current state of the query, even if there exists a paid order for the current user_id, then the query misses it and therefore never returns the response_code. What's wrong with it?
Ok I get the problem. It is the LIMIT 1 which was misleading me. If I remove this limit, I get many returned results, some where orders were paid, some where orders were not paid, therefore the LIMIT1 picks the first one, and by chance it lands on an unpaid order. Am I trying to achieve something which is not possible to do? Do I compulsory need to make 2 queries? One for the order only and one for the product only?
I have found a trick to fool the system: I use an ORDER BY response_code 0 ASC LIMIT 1 As unpaid orders receive a response_code > 0, then necessarily the first record has response_code of 0. However if more and more orders come into the equation, this means PgSQL will have to process more records, how is it optimized? I guess PgSQL will have to find all the records, than order them by response_code ASC, and then pick the first one. This is probably not a good practice? Should I go for 2 distinct queries?
On Thu, Oct 30, 2008 at 11:53:48PM +0100, Thomas wrote: > Here is the SQL I am working with: > SELECT products.*, orders.response_code FROM "products" JOIN items ON > products.id = items.product_id > LEFT OUTER JOIN orders ON (items.order_id = orders.id AND > orders.response_code = '0' AND orders.user_id = 2) WHERE (permalink = > E'product-1' AND products.site_id = 1) LIMIT 1 I find this formatting slightly unreadable, I find it much easier to read code when it's formatted nicely: SELECT p.*, o.response_code FROM products p INNER JOIN items i ON p.id = i.product_id LEFT JOIN orders o ON i.order_id = o.id AND o.user_id = 2 AND o.response_code = '0' WHERE p.permalink = 'product-1' AND p.site_id = 1 LIMIT 1; I'm guessing that "permalink" comes from the "products" relation, you don't specify anywhere. Also, is "response_code" of some string type, or is it a number? If you've not used subqueries, an option could be: SELECT *, ( SELECT MIN(response_code) FROM items i, orders o WHERE p.id = i.product_id AND i.order_id = o.id AND o.user_id = 2) AS response_code FROM products p WHERE permalink = 'product-1' AND site_id = 1; This will give you the minimum response code, as in later posts you seemed to want to know this. There are lots of options in SQL to write things differently, optimizing for different things as well as just plain doing something else. I'd also recommend going through a few SQL tutorials to get ideas of how to solve different problems. Sam