Thread: Unexpected results when joining on date fields
I have two tables, traffic and sales. Each one has a date field and lists the traffic and sales broken down by various parameters (multiple rows for each date). If I run select (select count(*) from traffic) as traffic, (select count(*) from sales) as sales; I get the following 49383;167807 if I run select count(*) from traffic t inner join sales s on t.date = s.date I get 24836841. If I change the join to a left join, right join, full join I get the same number of records. So I created a data table which just has the dates in it and ran this query. select count(d.date) from dates d inner join traffic t on t.date = d.date inner join sales s on s.date = d.date And I get the same number 24836841 Same goes for right joins on the above query. Left joins of course give a different answer as there are more dates in the date table than there are in the other tables. I am a bit perplexed by what is happening here. Cheers
I don't think you understand what JOIN does. Think of it as a double-nested FOR loop: for each record that has the value on the left side of the JOIN, it will match all records on the right side of the JOIN that meet the ON criteria. For example, if I have two tables:
--
Rick Genter
rick.genter@gmail.com
A (i int, j int):
i j
1 1
2 1
3 2
4 2
5 3
6 3
and
B (k int, j int)
k j
10 1
11 1
12 2
13 2
14 3
15 3
Then if I do
SELECT COUNT(*) FROM A JOIN B ON A.j = B.j
I'll get 12. Each record in A matches 2 records in B on the value of j. Study the following transcript:
bash-3.2$ bin/psql -d g2_master
Password:
psql (8.4.4)
Type "help" for help.
g2_master=# CREATE TABLE A (i int, j int);
CREATE TABLE
g2_master=# CREATE TABLE B (k int, j int);
CREATE TABLE
g2_master=# INSERT INTO A VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 3);
INSERT 0 6
g2_master=# INSERT INTO B VALUES (10, 1), (11, 1), (12, 2), (13, 2), (14, 3), (15, 3);
INSERT 0 6
g2_master=# SELECT COUNT(*) FROM A JOIN B ON A.j = B.j;
count
-------
12
(1 row)
g2_master=# SELECT * FROM A JOIN B ON A.j = B.j;
i | j | k | j
---+---+----+---
1 | 1 | 10 | 1
1 | 1 | 11 | 1
2 | 1 | 10 | 1
2 | 1 | 11 | 1
3 | 2 | 12 | 2
3 | 2 | 13 | 2
4 | 2 | 12 | 2
4 | 2 | 13 | 2
5 | 3 | 14 | 3
5 | 3 | 15 | 3
6 | 3 | 14 | 3
6 | 3 | 15 | 3
(12 rows)
g2_master=#
On Sun, Jul 10, 2011 at 4:58 PM, Tim Uckun <tim@basediary.com> wrote:
I have two tables, traffic and sales. Each one has a date field and
lists the traffic and sales broken down by various parameters
(multiple rows for each date).
If I run select (select count(*) from traffic) as traffic, (select
count(*) from sales) as sales; I get the following 49383;167807
if I run select count(*) from traffic t inner join sales s on t.date
= s.date I get 24836841.
If I change the join to a left join, right join, full join I get the
same number of records.
So I created a data table which just has the dates in it and ran this query.
select count(d.date) from dates d
inner join traffic t on t.date = d.date
inner join sales s on s.date = d.date
And I get the same number 24836841
Same goes for right joins on the above query. Left joins of course
give a different answer as there are more dates in the date table than
there are in the other tables.
I am a bit perplexed by what is happening here.
Cheers
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
Rick Genter
rick.genter@gmail.com
On 11 Jul 2011, at 1:58, Tim Uckun wrote: > I have two tables, traffic and sales. Each one has a date field and > lists the traffic and sales broken down by various parameters > (multiple rows for each date). > > If I run select (select count(*) from traffic) as traffic, (select > count(*) from sales) as sales; I get the following 49383;167807 > > if I run select count(*) from traffic t inner join sales s on t.date > = s.date I get 24836841. Perhaps you also want to filter that join so that you don't get matches for different articles sold at the same date? Justjoining on date doesn't seem to make a whole lot of sense. Alban Hertroys -- The size of a problem often equals the size of an ego. !DSPAM:737,4e1c7f0c12096580658153!