Re: Inner join question - Mailing list pgsql-general

From Nick Barr
Subject Re: Inner join question
Date
Msg-id 4034E925.4040006@chuckie.co.uk
In response to Inner join question  (Randall Skelton <skelton@brutus.uwaterloo.ca>)
List pgsql-general
Randall Skelton wrote:

> Greetings all,
>
> I am trying to do what should be a simple join but the tables are
> large and it is taking a long, long time.  I have the feeling that I
> have stuffed up something in the syntax.
>
> Here is what I have:
>
> telemetry=> select (tq1.timestamp = tq2.timestamp) as timestamp,
> tq1.value as q1, tq2.value as q2 from cal_quat_1 tq1 inner join
> cal_quat_2 as tq2 using (timestamp) where timestamp > '2004-01-12
> 09:47:56.0000 +0' and timestamp < '2004-01-12 09:50:44.7187 +0' order
> by timestamp;
>
> telemetry=> \d cal_quat_1
>                 Table "cal_quat_1"
>   Column   |           Type           | Modifiers
> -----------+--------------------------+-----------
>  timestamp | timestamp with time zone |
>  value     | double precision         |
>
> telemetry=> \d cal_quat_2
>                 Table "cal_quat_2"
>   Column   |           Type           | Modifiers
> -----------+--------------------------+-----------
>  timestamp | timestamp with time zone |
>  value     | double precision         |
>
> My understanding of an inner join is that the query above will
> restrict to finding tq1.timestamp, tq1.value and then move onto
> t12.value to search the subset.  I have tried this with and without
> the '=' sign and it isn't clear if it is making any difference at all
> (the timestamps are identical in the range of interest).  I have not
> allowed the query to finish as it seems to take more than 10 minutes.
> Both timestamps are indexed and I expect about 150 rows to be
> returned.  At the end of the day, I have four identical tables of
> quaternions (timestamp, value) and I need to extract them all for a
> range of timestamps.
>
> Cheers,
> Randall

We need more information to be able to help further. Can you supply:

1. Total number of rows in each table.
2. Results from "explain analyze <your query>"
3. Key configuration values from postgresql.conf.
4. Basic hardware configuration (CPU type and number, total RAM, and
HDD type, size and speed).
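
In case it helps, the first three items can be collected from psql roughly as follows. This is only a sketch: the table names and the time range are taken from your message, and the three settings shown are my assumption about which ones are most likely to matter here, not a complete list.

```sql
-- 1. Row counts (a full count can itself be slow on a large table).
SELECT count(*) FROM cal_quat_1;
SELECT count(*) FROM cal_quat_2;

-- 2. Plan with real timings. EXPLAIN ANALYZE actually runs the query,
--    so narrow the time range first if the full query never finishes.
EXPLAIN ANALYZE
SELECT tq1.timestamp, tq1.value AS q1, tq2.value AS q2
FROM cal_quat_1 AS tq1
INNER JOIN cal_quat_2 AS tq2 ON tq1.timestamp = tq2.timestamp
WHERE tq1.timestamp > '2004-01-12 09:47:56.0000 +0'::timestamptz
  AND tq1.timestamp < '2004-01-12 09:50:44.7187 +0'::timestamptz
ORDER BY tq1.timestamp;

-- 3. Server version and a few planner-related settings.
SELECT version();
SHOW shared_buffers;
SHOW effective_cache_size;
SHOW random_page_cost;
```

Pasting the output of each statement into a reply is enough; there is no need to send the whole postgresql.conf.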

In the meantime, can you try the following query instead?

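It selects tq1.timestamp directly, because (tq1.timestamp =
tq2.timestamp) returns a boolean rather than the time, and it casts
the bounds to timestamptz to match the column type. Note that BETWEEN
includes both end points.

select tq1.timestamp as timestamp, tq1.value as q1, tq2.value as q2
from cal_quat_1 as tq1, cal_quat_2 as tq2
where tq1.timestamp = tq2.timestamp
  and tq1.timestamp between '2004-01-12 09:47:56.0000 +0'::timestamptz
                        and '2004-01-12 09:50:44.7187 +0'::timestamptz
order by tq1.timestamp;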

As far as I know, and someone please correct me, this allows the planner
the most flexibility when figuring out the optimum plan.
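
Rather than take my word for it, the two spellings can be compared directly with plain EXPLAIN, which only plans the query and does not run it. The sketch below assumes the table and column names from the original message; the two statements differ only in how the join is written.

```sql
-- Implicit join: both tables in FROM, join condition in WHERE.
EXPLAIN
SELECT tq1.timestamp, tq1.value AS q1, tq2.value AS q2
FROM cal_quat_1 AS tq1, cal_quat_2 AS tq2
WHERE tq1.timestamp = tq2.timestamp
  AND tq1.timestamp BETWEEN '2004-01-12 09:47:56.0000 +0'::timestamptz
                        AND '2004-01-12 09:50:44.7187 +0'::timestamptz;

-- Explicit join: same condition expressed with INNER JOIN ... ON.
EXPLAIN
SELECT tq1.timestamp, tq1.value AS q1, tq2.value AS q2
FROM cal_quat_1 AS tq1
INNER JOIN cal_quat_2 AS tq2 ON tq1.timestamp = tq2.timestamp
WHERE tq1.timestamp BETWEEN '2004-01-12 09:47:56.0000 +0'::timestamptz
                        AND '2004-01-12 09:50:44.7187 +0'::timestamptz;
```

If both plans come out the same, the join syntax is probably not the cause of the slowness, and the row counts and configuration requested above become the more useful clues.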


Thanks

Nick


