Thread: left join is strange

left join is strange

From
Andrei Ivanov
Date:
Hello,
I have 2 tables:

CREATE TABLE products (
        id              SERIAL PRIMARY KEY,
        name            VARCHAR(255) NOT NULL
);

CREATE TABLE products_daily_compacted_views (
        product         INTEGER NOT NULL REFERENCES products,
        date            DATE NOT NULL DEFAULT ('NOW'::TEXT)::DATE,
        count           INTEGER NOT NULL
);

The table products has 1785 rows, the table products_daily_compacted_views
has 768 rows with date = current_date;

I want to list all the products and the number of times each product has
been viewed:

SELECT p.id, p.name, COALESCE(v.count, 0) AS views
FROM products p LEFT JOIN products_daily_compacted_views v ON p.id = v.product
WHERE v.date = current_date OR v.date IS NULL ORDER BY views DESC

The problem with this query is that it doesn't return all the products:
instead of 1785 rows, it returns 1077 rows.

This modified query seems to be correct, it returns all the products...

SELECT p.id, p.name, COALESCE(v.count, 0) AS views
FROM products p LEFT JOIN products_daily_compacted_views v
ON p.id = v.product AND v.date = current_date
ORDER BY views DESC

Could anybody explain to me why this happens?

Thank you.

Re: left join is strange

From
"Arjen van der Meijden"
Date:
> Andrei Ivanov wrote:
>
> I want to list all the products and the number of times each
> product has
> been viewed:
>
> SELECT p.id, p.name, COALESCE(v.count, 0) AS views
> FROM products p LEFT JOIN products_daily_compacted_views v ON
> p.id = v.product
> WHERE v.date = current_date OR v.date IS NULL ORDER BY views DESC
>
> The problem with this query is that it doesn't return all the
> products,
> instead of 1785 rows, it returns 1077 rows
And that is exactly as it should be.
You will get the left joined combination of p and v, but the filter in
the where is applied afterwards on all those combinations.
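
To make the "join first, filter afterwards" order concrete, here is a minimal
sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for
PostgreSQL, a fixed date in place of current_date, and three made-up products:
product 1 viewed today, product 2 viewed only yesterday, product 3 never viewed.

```python
import sqlite3

# Assumptions: SQLite stands in for PostgreSQL, and a fixed date stands in
# for current_date so the result is reproducible.
TODAY, YESTERDAY = "2003-12-08", "2003-12-07"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products_daily_compacted_views (
        product INTEGER NOT NULL REFERENCES products,
        date    TEXT NOT NULL,
        count   INTEGER NOT NULL
    );
    INSERT INTO products VALUES (1, 'a'), (2, 'b'), (3, 'c');
""")
# Hypothetical toy data: product 3 has no view rows at all.
conn.executemany(
    "INSERT INTO products_daily_compacted_views VALUES (?, ?, ?)",
    [(1, TODAY, 5), (2, YESTERDAY, 7)],
)

JOIN = """
    SELECT p.id, v.date
    FROM products p
    LEFT JOIN products_daily_compacted_views v ON p.id = v.product
"""

# Step 1: the left join alone keeps every product.
joined = conn.execute(JOIN + " ORDER BY p.id").fetchall()

# Step 2: WHERE runs on the joined rows. Product 2 carries a non-NULL date
# that is not today's, so its only row is dropped.
filtered = conn.execute(
    JOIN + " WHERE v.date = ? OR v.date IS NULL ORDER BY p.id", (TODAY,)
).fetchall()

print(joined)    # [(1, '2003-12-08'), (2, '2003-12-07'), (3, None)]
print(filtered)  # [(1, '2003-12-08'), (3, None)]
```

Product 2 survives the join with yesterday's date attached, so the IS NULL test
never applies to it.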

>
> This modified query seems to be correct, it returns all the
> products...
>
> SELECT p.id, p.name, COALESCE(v.count, 0) AS views
> FROM products p LEFT JOIN products_daily_compacted_views v
> ON p.id = v.product AND v.date = current_date
> ORDER BY views DESC
>
> Could anybody explain to me why does this happen ?
Here you apply your filter to the elements of v, before joining them to
the elements of p.
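
One way to picture "filter v first, then join" is to write the filter as a
subquery. The sketch below is an illustration only: it assumes SQLite via
Python's sqlite3 module instead of PostgreSQL, a fixed date in place of
current_date, and made-up toy rows. On this data the date test in ON and the
pre-filtered subquery give the same result.

```python
import sqlite3

TODAY = "2003-12-08"  # fixed stand-in for current_date (assumption)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products_daily_compacted_views (
        product INTEGER NOT NULL, date TEXT NOT NULL, count INTEGER NOT NULL
    );
    INSERT INTO products VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO products_daily_compacted_views VALUES
        (1, '2003-12-08', 5), (2, '2003-12-07', 7);
""")

# Date test inside ON, as in the modified query.
on_rows = conn.execute("""
    SELECT p.id, COALESCE(v.count, 0) AS views
    FROM products p
    LEFT JOIN products_daily_compacted_views v
           ON p.id = v.product AND v.date = ?
    ORDER BY views DESC, p.id
""", (TODAY,)).fetchall()

# The same idea spelled out: filter v first, then left join the result.
sub_rows = conn.execute("""
    SELECT p.id, COALESCE(v.count, 0) AS views
    FROM products p
    LEFT JOIN (SELECT product, count
               FROM products_daily_compacted_views
               WHERE date = ?) v
           ON p.id = v.product
    ORDER BY views DESC, p.id
""", (TODAY,)).fetchall()

print(on_rows)              # [(1, 5), (2, 0), (3, 0)]
print(on_rows == sub_rows)  # True
```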

Best regards,

Arjen




Re: left join is strange

From
Andrei Ivanov
Date:

On Mon, 8 Dec 2003, Arjen van der Meijden wrote:

> > Andrei Ivanov wrote:
> >
> > I want to list all the products and the number of times each
> > product has
> > been viewed:
> >
> > SELECT p.id, p.name, COALESCE(v.count, 0) AS views
> > FROM products p LEFT JOIN products_daily_compacted_views v ON
> > p.id = v.product
> > WHERE v.date = current_date OR v.date IS NULL ORDER BY views DESC
> >
> > The problem with this query is that it doesn't return all the
> > products,
> > instead of 1785 rows, it returns 1077 rows
> And that is exactly as it should be.
> You will get the left joined combination of p and v, but the filter in
> the where is applied afterwards on all those combinations.
>

I kinda figured that out, but still, being a left join, it should return
all the rows in the table products, which I then filter with
v.date = current_date OR v.date IS NULL.

v.date has 3 possible values: current_date, some other date or NULL, if
there is no corresponding row in products_daily_compacted_views for that
product.

I filter out only 1 value, and I still should get 1785 rows...



Re: left join is strange

From
"Arjen van der Meijden"
Date:
> Andrei Ivanov wrote:
>
> On Mon, 8 Dec 2003, Arjen van der Meijden wrote:
>
> > > Andrei Ivanov wrote:
> > >
> > > I want to list all the products and the number of times each
> > > product has
> > > been viewed:
> > >
> > > SELECT p.id, p.name, COALESCE(v.count, 0) AS views
> > > FROM products p LEFT JOIN products_daily_compacted_views v ON
> > > p.id = v.product
> > > WHERE v.date = current_date OR v.date IS NULL ORDER BY views DESC
> > >
> > > The problem with this query is that it doesn't return all the
> > > products,
> > > instead of 1785 rows, it returns 1077 rows
> > And that is exactly as it should be.
> > You will get the left joined combination of p and v, but
> the filter in
> > the where is applied afterwards on all those combinations.
> >
>
> I kinda figured that out, but still, being a left join, it
> should return
> all the rows in the table products, which I then filter with
> v.date = current_date OR v.date IS NULL.
>
> v.date has 3 possible values: current_date, some other date
> or NULL, if
> there is no corresponding row in
> products_daily_compacted_views for that
> product.
>
> I filter out only 1 value, and I still should get 1785 rows...

No, you combine two tables using a left join (and yes, you get at least 1785
rows from that left join, one or more for every product), which then (after
the joining) get filtered using your where.
Only the joined rows whose v.date is current_date or NULL will get through.
A product whose v rows all carry some other date has no NULL row to fall back
on, so it drops out completely, resulting in fewer than your 1785 rows.

Regards,

Arjen




Re: left join is strange

From
Tom Lane
Date:
Andrei Ivanov <andrei.ivanov@ines.ro> writes:
> I kinda figured that out, but still, being a left join, it should return
> all the rows in the table products, which I then filter with
> v.date = current_date OR v.date IS NULL.
> v.date has 3 possible values: current_date, some other date or NULL, if
> there is no corresponding row in products_daily_compacted_views for that
> product.

Right.  Your first query will show products for which (1) there is a v
row with date = current_date, or (2) there is *no* v row at all.  If
there is a v row with the wrong date, it will get through the left join
and then be eliminated at WHERE.  Because it gets through the left join,
no null-extended row is generated for that product, and so your OR
v.date IS NULL doesn't help.
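
The row counts quoted in this thread fit that explanation. The following is
only a back-of-the-envelope check: the three input numbers come from the
original post, while the last step assumes at most one v row per product per
day, which the table definition does not enforce.

```python
total_products = 1785     # rows in products (from the thread)
today_rows = 768          # v rows with date = current_date (from the thread)
first_query_rows = 1077   # rows returned by the WHERE version (from the thread)

# The WHERE version returns one row per v row dated today, plus one
# null-extended row per product that has no v rows at all.
never_viewed = first_query_rows - today_rows

# Assumption: at most one v row per product per day, so today_rows is also
# the number of distinct products viewed today.
only_other_dates = total_products - today_rows - never_viewed

print(never_viewed)      # 309
print(only_other_dates)  # 708
```

Under that assumption, the 708 missing products are the ones that have v rows,
but none dated today.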

In your second query, the date condition is considered part of the LEFT
JOIN condition, meaning that if no v rows pass the date condition, a
null-extended row will be emitted.
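
A small sketch of what "null-extended" looks like in practice. It assumes
SQLite via Python's sqlite3 module rather than PostgreSQL, a fixed date in
place of current_date, and hypothetical toy rows: product 1 has rows for today
and yesterday, product 2 only for yesterday, product 3 none.

```python
import sqlite3

TODAY = "2003-12-08"  # fixed stand-in for current_date (assumption)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products_daily_compacted_views (
        product INTEGER NOT NULL, date TEXT NOT NULL, count INTEGER NOT NULL
    );
    INSERT INTO products VALUES (1, 'a'), (2, 'b'), (3, 'c');
    -- product 1: today and yesterday; product 2: yesterday only; product 3: none
    INSERT INTO products_daily_compacted_views VALUES
        (1, '2003-12-08', 5), (1, '2003-12-07', 4), (2, '2003-12-07', 7);
""")

# The date test is part of the join condition, so a product with no v row
# passing it still gets exactly one row, with NULLs in the v columns.
rows = conn.execute("""
    SELECT p.id, v.product, v.date, v.count
    FROM products p
    LEFT JOIN products_daily_compacted_views v
           ON p.id = v.product AND v.date = ?
    ORDER BY p.id
""", (TODAY,)).fetchall()

for row in rows:
    print(row)
# (1, 1, '2003-12-08', 5)
# (2, None, None, None)
# (3, None, None, None)
```

Product 2 has a v row, but it fails the date test inside ON, so it is treated
like a product with no match and comes back null-extended.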

            regards, tom lane