Thread: left join is strange
Hello, I have 2 tables: CREATE TABLE products ( id SERIAL PRIMARY KEY, name VARCHAR(255) NOT NULL ); CREATE TABLE products_daily_compacted_views ( product INTEGER NOT NULL REFERENCES products, date DATE NOT NULL DEFAULT ('NOW'::TEXT)::DATE, count INTEGER NOT NULL ); The table products has 1785 rows, the table products_daily_compacted_views has 768 rows with date = current_date; I want to list all the products and the number of times each product has been viewed: SELECT p.id, p.name, COALESCE(v.count, 0) AS views FROM products p LEFT JOIN products_daily_compacted_views v ON p.id = v.product WHERE v.date = current_date OR v.date IS NULL ORDER BY views DESC The problem with this query is that it doesn't return all the products, instead of 1785 rows, it returns 1077 rows This modified query seems to be correct, it returns all the products... SELECT p.id, p.name, COALESCE(v.count, 0) AS views FROM products p LEFT JOIN products_daily_compacted_views v ON p.id = v.product AND v.date = current_date ORDER BY views DESC Could anybody explain to me why does this happen ? Thank you.
> Andrei Ivanov wrote: > > I want to list all the products and the number of times each > product has > been viewed: > > SELECT p.id, p.name, COALESCE(v.count, 0) AS views > FROM products p LEFT JOIN products_daily_compacted_views v ON > p.id = v.product > WHERE v.date = current_date OR v.date IS NULL ORDER BY views DESC > > The problem with this query is that it doesn't return all the > products, > instead of 1785 rows, it returns 1077 rows And that is exactly as it should be. You will get the left joined combination of p and v, but the filter in the where is applied afterwards on all those combinations. > > This modified query seems to be correct, it returns all the > products... > > SELECT p.id, p.name, COALESCE(v.count, 0) AS views > FROM products p LEFT JOIN products_daily_compacted_views v > ON p.id = v.product AND v.date = current_date > ORDER BY views DESC > > Could anybody explain to me why does this happen ? Here you apply your filter to the elements of v, before joining them to the elements of p. Best regards, Arjen
On Mon, 8 Dec 2003, Arjen van der Meijden wrote: > > Andrei Ivanov wrote: > > > > I want to list all the products and the number of times each > > product has > > been viewed: > > > > SELECT p.id, p.name, COALESCE(v.count, 0) AS views > > FROM products p LEFT JOIN products_daily_compacted_views v ON > > p.id = v.product > > WHERE v.date = current_date OR v.date IS NULL ORDER BY views DESC > > > > The problem with this query is that it doesn't return all the > > products, > > instead of 1785 rows, it returns 1077 rows > And that is exactly as it should be. > You will get the left joined combination of p and v, but the filter in > the where is applied afterwards on all those combinations. > I kinda figured that out, but still, being a left join, it should return all the rows in the table products, which I then filter with v.date = current_date OR v.date IS NULL. v.date has 3 possible values: current_date, some other date or NULL, if there is no corresponding row in products_daily_compacted_views for that product. I filter out only 1 value, and I still should get 1785 rows... > > > > This modified query seems to be correct, it returns all the > > products... > > > > SELECT p.id, p.name, COALESCE(v.count, 0) AS views > > FROM products p LEFT JOIN products_daily_compacted_views v > > ON p.id = v.product AND v.date = current_date > > ORDER BY views DESC > > > > Could anybody explain to me why does this happen ? > Here you apply your filter to the elements of v, before joining them to > the elements of p. > > Best regards, > > Arjen > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >
> Andrei Ivanov wrote: > > On Mon, 8 Dec 2003, Arjen van der Meijden wrote: > > > > Andrei Ivanov wrote: > > > > > > I want to list all the products and the number of times each > > > product has > > > been viewed: > > > > > > SELECT p.id, p.name, COALESCE(v.count, 0) AS views > > > FROM products p LEFT JOIN products_daily_compacted_views v ON > > > p.id = v.product > > > WHERE v.date = current_date OR v.date IS NULL ORDER BY views DESC > > > > > > The problem with this query is that it doesn't return all the > > > products, > > > instead of 1785 rows, it returns 1077 rows > > And that is exactly as it should be. > > You will get the left joined combination of p and v, but > the filter in > > the where is applied afterwards on all those combinations. > > > > I kinda figured that out, but still, being a left join, it > should return > all the rows in the table products, which I then filter with > v.date = current_date OR v.date IS NULL. > > v.date has 3 possible values: current_date, some other date > or NULL, if > there is no corresponding row in > products_daily_compacted_views for that > product. > > I filter out only 1 value, and I still should get 1785 rows... No, you combine two table using a left join (and yes, you get 1785 rows from that left join), which then (after the joining) get filtered using your where. The values that have the current_date (which are probably none, since that is taken at the moment of the selection, not at the moment of the insert) or the NULL will get through, resulting in less than your 1785 rows. Regards, Arjen
Andrei Ivanov <andrei.ivanov@ines.ro> writes: > I kinda figured that out, but still, being a left join, it should return > all the rows in the table products, which I then filter with > v.date = current_date OR v.date IS NULL. > v.date has 3 possible values: current_date, some other date or NULL, if > there is no corresponding row in products_daily_compacted_views for that > product. Right. Your first query will show products for which (1) there is a v row with date = current_date, or (2) there is *no* v row at all. If there is a v row with the wrong date, it will get through the left join and then be eliminated at WHERE. Because it gets through the left join, no null-extended row is generated for that product, and so your OR v.date IS NULL doesn't help. In your second query, the date condition is considered part of the LEFT JOIN condition, meaning that if no v rows pass the date condition, a null-extended row will be emitted. regards, tom lane