Thread: Unexpected results when joining on date fields

Unexpected results when joining on date fields

From
Tim Uckun
Date:
I have two tables, traffic and sales. Each one has a date field and
lists the traffic and sales broken down by various parameters
(multiple rows for each date).

If I run  select (select count(*) from traffic) as traffic, (select
count(*) from sales) as sales; I get the following  49383;167807

if I run   select count(*) from traffic t inner join sales s on t.date
= s.date  I get 24836841.

If I change the join to a left join, right join, full join I get the
same number of records.

So I created a data table which just has the dates in it and ran this query.

select count(d.date) from dates d
inner join traffic t on t.date = d.date
inner join sales s on s.date = d.date

And I get the same number 24836841

Same goes for right joins on the above query. Left joins of course
give a different answer as there are more dates in the date table than
there are in the other tables.

I am a bit perplexed by what is happening here.

Cheers

Re: Unexpected results when joining on date fields

From
Rick Genter
Date:
I don't think you understand what JOIN does. Think of it as a double-nested FOR loop: for each record that has the value on the left side of the JOIN, it will match all records on the right side of the JOIN that meet the ON criteria. For example, if I have two tables:

A (i int, j int):

i       j
1     1
2     1
3     2
4     2
5     3
6     3


and

B (k int, j int)
k     j
10   1
11   1
12   2
13   2
14   3
15   3

Then if I do

SELECT COUNT(*) FROM A JOIN B ON A.j = B.j

I'll get 12. Each record in A matches 2 records in B on the value of j. Study the following transcript:

bash-3.2$ bin/psql -d g2_master
Password: 
psql (8.4.4)
Type "help" for help.

g2_master=# CREATE TABLE A (i int, j int);
CREATE TABLE
g2_master=# CREATE TABLE B (k int, j int);
CREATE TABLE
g2_master=# INSERT INTO A VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 3);
INSERT 0 6
g2_master=# INSERT INTO B VALUES (10, 1), (11, 1), (12, 2), (13, 2), (14, 3), (15, 3);
INSERT 0 6
g2_master=# SELECT COUNT(*) FROM A JOIN B ON A.j = B.j;
 count 
-------
    12
(1 row)

g2_master=# SELECT * FROM A JOIN B ON A.j = B.j;
 i | j | k  | j 
---+---+----+---
 1 | 1 | 10 | 1
 1 | 1 | 11 | 1
 2 | 1 | 10 | 1
 2 | 1 | 11 | 1
 3 | 2 | 12 | 2
 3 | 2 | 13 | 2
 4 | 2 | 12 | 2
 4 | 2 | 13 | 2
 5 | 3 | 14 | 3
 5 | 3 | 15 | 3
 6 | 3 | 14 | 3
 6 | 3 | 15 | 3
(12 rows)

g2_master=# 


On Sun, Jul 10, 2011 at 4:58 PM, Tim Uckun <tim@basediary.com> wrote:
I have two tables, traffic and sales. Each one has a date field and
lists the traffic and sales broken down by various parameters
(multiple rows for each date).

If I run  select (select count(*) from traffic) as traffic, (select
count(*) from sales) as sales; I get the following  49383;167807

if I run   select count(*) from traffic t inner join sales s on t.date
= s.date  I get 24836841.

If I change the join to a left join, right join, full join I get the
same number of records.

So I created a data table which just has the dates in it and ran this query.

select count(d.date) from dates d
inner join traffic t on t.date = d.date
inner join sales s on s.date = d.date

And I get the same number 24836841

Same goes for right joins on the above query. Left joins of course
give a different answer as there are more dates in the date table than
there are in the other tables.

I am a bit perplexed by what is happening here.

Cheers

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
Rick Genter
rick.genter@gmail.com

Re: Unexpected results when joining on date fields

From
Alban Hertroys
Date:
On 11 Jul 2011, at 1:58, Tim Uckun wrote:

> I have two tables, traffic and sales. Each one has a date field and
> lists the traffic and sales broken down by various parameters
> (multiple rows for each date).
>
> If I run  select (select count(*) from traffic) as traffic, (select
> count(*) from sales) as sales; I get the following  49383;167807
>
> if I run   select count(*) from traffic t inner join sales s on t.date
> = s.date  I get 24836841.


Perhaps you also want to filter that join so that you don't get matches for different articles sold at the same date?
Justjoining on date doesn't seem to make a whole lot of sense. 

Alban Hertroys

--
The size of a problem often equals the size of an ego.



!DSPAM:737,4e1c7f0c12096580658153!