Re: Help with a JOIN. - Mailing list pgsql-novice

From Mike Swierczek
Subject Re: Help with a JOIN.
Date
Msg-id 53CFC701.8080200@gmail.com
Whole thread Raw
In response to Re: Help with a JOIN.  (Ken Benson <Ken@infowerks.com>)
List pgsql-novice
Comments below, as normal.

On 07/23/2014 10:11 AM, Ken Benson wrote:
> From: Paul Linehan [mailto:linehanp@tcd.ie]
> Sent: Wednesday, July 23, 2014 2:47 AM
> To: Ken Benson
> Cc: pgsql-novice@postgresql.org
> Subject: Re: [NOVICE] Help with a JOIN.
>
>
> Hi all (again),
>
> Ah... the joys of the shifting spec... I had an issue over the weekend and received assistance from
> the list. I thought the problem was solved, but in fact there was a "modification to requirements" which
> means that I'm back at square one!
>
>
> I have two tables - ticket and comment (DDL and DML below) and I want to get the latest comment on
> a ticket which hasn't been modified for some arbitrary period (in this example, 2 years (60M seconds)).
>
> The result I want is this
>
>  ticket_id | ticket_description | c_max_date |   comments_comment
> -----------+--------------------+------------+-----------------------
>          3 | ticket 3           | 1089386130 | comment 4 on ticket 3
>          4 | ticket 4           | 1151680530 | comment 4 on ticket 4
>
>
> but on running the query
>
>
> SELECT t.ticket_id, t.ticket_description, x.c_max_date, z.comments_comment
> FROM
>  (
>    SELECT ticket_id, MAX(c.comments_timestamp) AS c_max_date
>    FROM comment c
>    WHERE c.comments_timestamp < extract(epoch FROM TIMESTAMP '2014-07-20') - 60000000
>    GROUP BY ticket_id
>  )

I suspect part of the problem, or a related problem, is that your
timestamp restriction is inside your inner select.   Say you have a
ticket 6 with comments in 2013, 2011, and 2010.  Your inner select will
return a row 6, 2011 for that ticket.  I don't think that's what you
want, from your description you only want tickets that don't have
comments in the last two years, not the most recent comment that's at
least two years old for each ticket.

I believe instead you want your inner select to be:
(
  SELECT ticket_id, MAX(c.comments_timestamp) AS c_max_date
  FROM comment c
  GROUP BY ticket_id
  HAVING MAX(c.comments_timestamp) < extract(epoch FROM TIMESTAMP
'2014-07-20') - 60000000
)

Good luck,
Mike

> AS x INNER JOIN ticket AS t ON t.ticket_id = x.ticket_id
> INNER JOIN comment AS z ON z.comments_timestamp = x.c_max_date
> [[KenB]] AND x.ticket_id=z.ticket_id
> ;
>
>
> I'm getting this
>
>  ticket_id | ticket_description | c_max_date |   comments_comment
> -----------+--------------------+------------+-----------------------
>          3 | ticket 3           | 1089386130 | comment 3 on ticket 4  <<=== matches a timestamp for ticket 4
>          3 | ticket 3           | 1089386130 | comment 4 on ticket 3
>          4 | ticket 4           | 1151680530 | comment 4 on ticket 4
>
>
>
> This is because when I run this query
>
>
> select ticket_id, comment_id, comments_timestamp
> from comment where ticket_id IN (3, 4)
> order by ticket_id, comment_id, comments_timestamp;
>
> I see the data has the following
>
>  ticket_id | comment_id | comments_timestamp
> -----------+------------+--------------------
>          3 |          9 |         1007133330
>          3 |         10 |         1013613330
>          3 |         11 |         1088608530
>          3 |         12 |         1089386130 <<-- matches on ticket 4
>          4 |         13 |         1070205330
>          4 |         14 |         1076685330
>          4 |         15 |         1089386130 <<-- matches on ticket 3
>          4 |         16 |         1151680530
>
> So my query is picking up the match between 3 (latest comments_timestamp) and a
> timestamp on ticket 4 which is not the latest comments_timestamp for ticket 4.
>
>
> What I would like is for my SQL to pick up the latest comments_timestamp by each ticket
> and not the match between the latest comment on ticket 3 and one of the timestamps on
> ticket 4.
>
> I did think that I'd be able to use the comment_id field in the comment table, but as
> it turns out, this is *_not_* indicative of the latest comment - comments may be modified
> after they have been entered, so the largest value of comment_id for a given ticket
> is not necessarily the last modification for that ticket.
>
>
> Any assistance/ideas gratefully received.
>
>
> Paul...
>
>
>
> ======== DDL and DML for my problem ====================
>
> CREATE TABLE IF NOT EXISTS ticket
> (
>   ticket_id INT NOT NULL,
>   ticket_description VARCHAR(25),
>   PRIMARY KEY (ticket_id)
> );
>
>
> CREATE TABLE IF NOT EXISTS comment
> (
>   comment_id INT NOT NULL,
>   ticket_id INT NOT NULL,
>   comments_comment VARCHAR(45) NULL,
>   comments_timestamp INT NULL,
>   PRIMARY KEY (comment_id)
> );
>
> INSERT INTO ticket VALUES(1, 'ticket 1');
> INSERT INTO ticket VALUES(2, 'ticket 2');
> INSERT INTO ticket VALUES(3, 'ticket 3');
> INSERT INTO ticket VALUES(4, 'ticket 4');
> INSERT INTO ticket VALUES(5, 'ticket 5');
>
> INSERT INTO comment VALUES(1, 1, 'comment 1 on ticket 1', extract(epoch FROM TIMESTAMP '2013-11-30 15:15:30'));
> INSERT INTO comment VALUES(2, 1, 'comment 2 on ticket 1', extract(epoch FROM TIMESTAMP '2014-02-13 15:15:30'));
> INSERT INTO comment VALUES(3, 1, 'comment 3 on ticket 1', extract(epoch FROM TIMESTAMP '2014-06-30 15:15:30'));
> INSERT INTO comment VALUES(4, 1, 'comment 4 on ticket 1', extract(epoch FROM TIMESTAMP '2014-07-09 15:15:30'));
> INSERT INTO comment VALUES(5, 2, 'comment 1 on ticket 2', extract(epoch FROM TIMESTAMP '2013-11-30 15:15:30'));
> INSERT INTO comment VALUES(6, 2, 'comment 2 on ticket 2', extract(epoch FROM TIMESTAMP '2014-02-13 15:15:30'));
> INSERT INTO comment VALUES(7, 2, 'comment 3 on ticket 2', extract(epoch FROM TIMESTAMP '2014-07-30 15:15:30'));
> INSERT INTO comment VALUES(8, 2, 'comment 4 on ticket 2', extract(epoch FROM TIMESTAMP '2014-08-09 15:15:30'));
> INSERT INTO comment VALUES(9, 3, 'comment 1 on ticket 3', extract(epoch FROM TIMESTAMP '2001-11-30 15:15:30'));
> INSERT INTO comment VALUES(10, 3, 'comment 2 on ticket 3', extract(epoch FROM TIMESTAMP '2002-02-13 15:15:30'));
> INSERT INTO comment VALUES(11, 3, 'comment 3 on ticket 3', extract(epoch FROM TIMESTAMP '2004-06-30 15:15:30'));
> INSERT INTO comment VALUES(12, 3, 'comment 4 on ticket 3', extract(epoch FROM TIMESTAMP '2004-07-09 15:15:30'));
> INSERT INTO comment VALUES(13, 4, 'comment 1 on ticket 4', extract(epoch FROM TIMESTAMP '2003-11-30 15:15:30'));
> INSERT INTO comment VALUES(14, 4, 'comment 2 on ticket 4', extract(epoch FROM TIMESTAMP '2004-02-13 15:15:30'));
> INSERT INTO comment VALUES(15, 4, 'comment 3 on ticket 4', extract(epoch FROM TIMESTAMP '2004-07-09 15:15:30'));
> INSERT INTO comment VALUES(16, 4, 'comment 4 on ticket 4', extract(epoch FROM TIMESTAMP '2006-06-30 15:15:30'));
> INSERT INTO comment VALUES(17, 5, 'comment 1 on ticket 1', extract(epoch FROM TIMESTAMP '2013-11-30 15:15:30'));
> INSERT INTO comment VALUES(18, 5, 'comment 2 on ticket 2', extract(epoch FROM TIMESTAMP '2014-02-13 15:15:30'));
> INSERT INTO comment VALUES(19, 5, 'comment 3 on ticket 3', extract(epoch FROM TIMESTAMP '2014-06-30 15:15:30'));
> INSERT INTO comment VALUES(20, 5, 'comment 4 on ticket 4', extract(epoch FROM TIMESTAMP '2014-07-09 15:15:30'));
>
>
>
> 2014-07-20 17:51 GMT+01:00 Paul Linehan <linehanp@tcd.ie<mailto:linehanp@tcd.ie>>:
>
> [[KenB]]  You're asking for the MAX(comments_comment) …
> <rest snipped>
>
> Thanks for that Ken - just another small question for the list though.
> Your query works perfectly for the data that I posted, however, I have data which has
> duplicated dates (see DDL and DML below).
>
> With this data, the result of your query
>
> SELECT t.ticket_id, t.ticket_description, x.c_max_date, z.comments_comment
> FROM
>  (
>    SELECT ticket_id, MAX(c.comments_timestamp) AS c_max_date
>    FROM comment c
>    WHERE c.comments_timestamp < extract(epoch FROM TIMESTAMP '2014-07-20') - 60000000
>    GROUP BY ticket_id
>  )
> AS x INNER JOIN ticket AS t ON t.ticket_id = x.ticket_id
> INNER JOIN comment AS z ON z.comments_timestamp = x.c_max_date;
> is
>
>  ticket_id | ticket_description | c_max_date |   comments_comment
> -----------+--------------------+------------+-----------------------
>          3 | ticket 3           | 1089386130 | comment 3 on ticket 4  <<=== matches a timestamp for ticket 4
>          3 | ticket 3           | 1089386130 | comment 4 on ticket 3
>          4 | ticket 4           | 1151680530 | comment 4 on ticket 4
>
> The result I want is
>
>  ticket_id | ticket_description | c_max_date |   comments_comment
> -----------+--------------------+------------+-----------------------
>          3 | ticket 3           | 1089386130 | comment 4 on ticket 3
>          4 | ticket 4           | 1151680530 | comment 4 on ticket 4
>
>
> Running this query on the comments table
> select ticket_id, comment_id, comments_timestamp
> from comment where ticket_id IN (3, 4)
> order by ticket_id, comment_id, comments_timestamp;
>
> gives
>
>  ticket_id | comment_id | comments_timestamp
> -----------+------------+--------------------
>          3 |          9 |         1007133330
>          3 |         10 |         1013613330
>          3 |         11 |         1088608530
>          3 |         12 |         1089386130 <<-- matches on ticket 4
>          4 |         13 |         1070205330
>          4 |         14 |         1076685330
>          4 |         15 |         1089386130 <<-- matches on ticket 3
>          4 |         16 |         1151680530
>
> The problem is that the last date for ticket 3 matches a date (not the last)
> for ticket 4.
> Is there any way that I can pick up only the latest ticket in this case - you may
> assume that comment_id is some sort of auto incrementing field - but I can't include
> comment_id in my x subquery. I've been experimenting, but to no avail.
>
> TIA and rgs,
> Paul...
>
> ======== DDL and DML for my problem ====================
>
> CREATE TABLE IF NOT EXISTS ticket
> (
>   ticket_id INT NOT NULL,
>   ticket_description VARCHAR(25),
>   PRIMARY KEY (ticket_id)
> );
>
>
> CREATE TABLE IF NOT EXISTS comment
> (
>   comment_id INT NOT NULL,
>   ticket_id INT NOT NULL,
>   comments_comment VARCHAR(45) NULL,
>   comments_timestamp INT NULL,
>   PRIMARY KEY (comment_id)
> );
>
> INSERT INTO ticket VALUES(1, 'ticket 1');
> INSERT INTO ticket VALUES(2, 'ticket 2');
> INSERT INTO ticket VALUES(3, 'ticket 3');
> INSERT INTO ticket VALUES(4, 'ticket 4');
> INSERT INTO ticket VALUES(5, 'ticket 5');
>
> INSERT INTO comment VALUES(1, 1, 'comment 1 on ticket 1', extract(epoch FROM TIMESTAMP '2013-11-30 15:15:30'));
> INSERT INTO comment VALUES(2, 1, 'comment 2 on ticket 1', extract(epoch FROM TIMESTAMP '2014-02-13 15:15:30'));
> INSERT INTO comment VALUES(3, 1, 'comment 3 on ticket 1', extract(epoch FROM TIMESTAMP '2014-06-30 15:15:30'));
> INSERT INTO comment VALUES(4, 1, 'comment 4 on ticket 1', extract(epoch FROM TIMESTAMP '2014-07-09 15:15:30'));
> INSERT INTO comment VALUES(5, 2, 'comment 1 on ticket 2', extract(epoch FROM TIMESTAMP '2013-11-30 15:15:30'));
> INSERT INTO comment VALUES(6, 2, 'comment 2 on ticket 2', extract(epoch FROM TIMESTAMP '2014-02-13 15:15:30'));
> INSERT INTO comment VALUES(7, 2, 'comment 3 on ticket 2', extract(epoch FROM TIMESTAMP '2014-07-30 15:15:30'));
> INSERT INTO comment VALUES(8, 2, 'comment 4 on ticket 2', extract(epoch FROM TIMESTAMP '2014-08-09 15:15:30'));
> INSERT INTO comment VALUES(9, 3, 'comment 1 on ticket 3', extract(epoch FROM TIMESTAMP '2001-11-30 15:15:30'));
> INSERT INTO comment VALUES(10, 3, 'comment 2 on ticket 3', extract(epoch FROM TIMESTAMP '2002-02-13 15:15:30'));
> INSERT INTO comment VALUES(11, 3, 'comment 3 on ticket 3', extract(epoch FROM TIMESTAMP '2004-06-30 15:15:30'));
> INSERT INTO comment VALUES(12, 3, 'comment 4 on ticket 3', extract(epoch FROM TIMESTAMP '2004-07-09 15:15:30'));
> INSERT INTO comment VALUES(13, 4, 'comment 1 on ticket 4', extract(epoch FROM TIMESTAMP '2003-11-30 15:15:30'));
> INSERT INTO comment VALUES(14, 4, 'comment 2 on ticket 4', extract(epoch FROM TIMESTAMP '2004-02-13 15:15:30'));
> INSERT INTO comment VALUES(15, 4, 'comment 3 on ticket 4', extract(epoch FROM TIMESTAMP '2004-07-09 15:15:30'));
> INSERT INTO comment VALUES(16, 4, 'comment 4 on ticket 4', extract(epoch FROM TIMESTAMP '2006-06-30 15:15:30'));
> INSERT INTO comment VALUES(17, 5, 'comment 1 on ticket 1', extract(epoch FROM TIMESTAMP '2013-11-30 15:15:30'));
> INSERT INTO comment VALUES(18, 5, 'comment 2 on ticket 2', extract(epoch FROM TIMESTAMP '2014-02-13 15:15:30'));
> INSERT INTO comment VALUES(19, 5, 'comment 3 on ticket 3', extract(epoch FROM TIMESTAMP '2014-06-30 15:15:30'));
> INSERT INTO comment VALUES(20, 5, 'comment 4 on ticket 4', extract(epoch FROM TIMESTAMP '2014-07-09 15:15:30'));
>
>
> --
>
> linehanp@tcd.ie<mailto:linehanp@tcd.ie>
>
> Mob: 00 353 86 864 5772
>
>
>
>
> --
>
> linehanp@tcd.ie<mailto:linehanp@tcd.ie>
>
> Mob: 00 353 86 864 5772
>


pgsql-novice by date:

Previous
From: David G Johnston
Date:
Subject: Re: Help with a JOIN.
Next
From: JORGE MALDONADO
Date:
Subject: Multiple foreign keys to the same table