Re: Joining time fields? - Mailing list pgsql-novice

From Bryan Lee Nuse
Subject Re: Joining time fields?
Date
Msg-id 387689AA-524C-4639-9EDB-BED34F25A052@uga.edu
Whole thread Raw
In response to Re: Joining time fields?  ("Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>)
List pgsql-novice
Hello James,


Here's an alternative query that I believe gets what you're asking for.  It is less clever and succinct than the one Oliveiros suggested, but may be easier to scrutinize if you are carrying a lot of additional columns through to the resulting table (you don't need to GROUP BY a bunch of columns, as you did in the query you posted).  Also, I'd suggest saving any WHERE clause for the outer query (no matter what your final code looks like), as that will help in trouble-shooting.  Like Oliveiros's query, this one rests upon a cross join, and will therefore be slow if your tables are huge.

Here's a self-contained test:

CREATE TABLE table_one (
 date_time   TIMESTAMP,
 letter   char(1)
);

CREATE TABLE table_two (
 date_time   TIMESTAMP,
 letter   char(1)
);

INSERT INTO table_one
 VALUES ('2012-06-23 11:43:26', 'A'),
  ('2010-05-16 06:45:20', 'B'),
  ('2000-01-04 15:57:01', 'C'),
  ('1993-10-06 04:36:09', 'D'); 

INSERT INTO table_two
 VALUES ('2000-06-23 11:43:26', 'W'),
  ('1994-05-16 06:45:20', 'X'),
  ('2008-01-04 15:57:01', 'Y'),
  ('1994-05-16 06:45:20', 'Z'); -- duplicate value 



The query.  Two WITH queries, the first to perform the cross join, the second to aggregate the crossed table according to the minimum time interval.  These two queries are then JOINed in the main query:

WITH 
crossed AS (
  SELECT 
    A.date_time AS dt_a,
    A.letter AS let_a,
    B.date_time AS dt_b,
    B.letter AS let_b,
    @(extract(EPOCH FROM A.date_time - B.date_time)) AS interval
    FROM
    table_one A, table_two B
    ),
grouped AS (
  SELECT dt_a, min(interval) AS interval
    FROM crossed
    GROUP BY dt_a
    )
SELECT C.*
  FROM 
  grouped G
  LEFT JOIN
    crossed C
    USING (dt_a, interval)
  ORDER BY dt_a;



Here's the output (recovers the duplicate rows in table_two, as you specified) :

        dt_a         |        dt_b         | interval  
---------------------+---------------------+-----------
 1993-10-06 04:36:09 | 1994-05-16 06:45:20 |  19188551
 1993-10-06 04:36:09 | 1994-05-16 06:45:20 |  19188551
 2000-01-04 15:57:01 | 2000-06-23 11:43:26 |  14759185
 2010-05-16 06:45:20 | 2008-01-04 15:57:01 |  74530099
 2012-06-23 11:43:26 | 2008-01-04 15:57:01 | 140989585
(5 rows)

Hope that is helpful...
Bryan




pgsql-novice by date:

Previous
From: Tom Lane
Date:
Subject: Re: Is there an answer to the Ultimate Question for PostgreSQL?
Next
From: "Oliveiros d'Azevedo Cristina"
Date:
Subject: Re: Joining time fields?