JOIN producing duplicate results - Mailing list pgsql-novice

From Lonni J Friedman
Subject JOIN producing duplicate results
Date
Msg-id CAP=oouHhqwj6j2tDz_YQ5RaJVsRopPtm6Dy9cFxYy8RxR_OGFg@mail.gmail.com
Whole thread Raw
Responses Re: JOIN producing duplicate results
Re: JOIN producing duplicate results
List pgsql-novice
I've got a query that is joining data across 4 tables to provide data
based on test results.  The query is working fine, except for the fact
that its returning two identical records for each row of unique data.
 If I throw a DISTINCT in front of the primary key column (a.id) of
one of the tables in the join, that eliminates all the duplicates.
However, I've read (and found) that DISTINCT tends to introduce a
performance hit, so I'm hoping to find a better performing solution,
if possible.  Hopefully I'm just doing something silly with my JOINS
that is easily fixed.  This is on postgresql-9.0.x, and yes I'm aware
that if i upgraded to 9.1.x then I could likely do a 'group by a.id',
but for now I'm stuck on 9.0.x.

Here's the query:
SELECT a.id,a.suiteid,a.testname,date_trunc('second',a.last_update) AS
last_update,regexp_replace(p.relname,E'tests','','g'),o.osname
FROM smoketests AS a, pg_class AS p, smoke AS t, osversmap AS o
WHERE a.osversion=o.osversion AND a.suiteid=t.id AND a.tableoid=p.oid
AND ( a.current_status='FAILED' ) AND ( a.arch='i386' ) AND (
a.os='Darwin' ) AND a.last_update>'2012-05-01 04:00:00' AND
a.last_update<'2012-05-02 14:20:45'
ORDER BY a.id ;

    id    | suiteid |     testname     |     last_update     |
regexp_replace |   osname
----------+---------+------------------+---------------------+----------------+------------
 32549818 |  668232 | bug377064        | 2012-05-01 08:38:07 | smoke
   | OSX-10.7.x
 32549818 |  668232 | bug377064        | 2012-05-01 08:38:07 | smoke
   | OSX-10.7.x
 32549819 |  668232 | funcmem_resize   | 2012-05-01 08:38:07 | smoke
   | OSX-10.7.x
 32549819 |  668232 | funcmem_resize   | 2012-05-01 08:38:07 | smoke
   | OSX-10.7.x
 32549820 |  668232 | leitest | 2012-05-01 08:38:07 | smoke      | OSX-10.7.x
 32549820 |  668232 | leitest | 2012-05-01 08:38:07 | smoke      | OSX-10.7.x

The problem is visible in the id column, where there are two of each
value returned even though a.id is the unique primary key of the smoke
table and doesn't really have duplicates.  The 'smoke' table has a one
to many relationship with the smoketests table, but I'm still rather
confused why I'm getting the duplicates of everything.

pgsql-novice by date:

Previous
From: Josh Kupershmidt
Date:
Subject: Re: PostgresSQL 8.4 - Data output - Newline
Next
From: Binand Sethumadhavan
Date:
Subject: Re: JOIN producing duplicate results