Thread: Join issue on a maximum value

Join issue on a maximum value

From
Heflin
Date:
OK, it's been a while since I've had to do anything remotely complex in 
SQL, so this may just be a pure brain block on my part.

I have 2 tables, auction and image, defined like this:
                                  Table "public.auction"    Column      |  Type   |                            
Modifiers                           
-----------------+---------+-----------------------------------------------------------------auction_id      | integer
|not null default 
 
nextval('public.auction_auction_id_seq'::text)auction_descrip | text    |auction_owner   | text    |
Indexes:   "auction_pkey" primary key, btree (auction_id)

                               Table "public.image"  Column    |  Type   |                          
Modifiers                         
-------------+---------+-------------------------------------------------------------image_id    | integer | not null
default
 
nextval('public.image_image_id_seq'::text)auction_id  | integer | not nullimage_descr | text    |
Indexes:   "image_pkey" primary key, btree (image_id)
Foreign-key constraints:   "$1" FOREIGN KEY (auction_id) REFERENCES auction(auction_id) ON 
UPDATE RESTRICT ON DELETE RESTRICT


Current data in the tables:

play=# select * from auction
play-# ;auction_id | auction_descrip | auction_owner
------------+-----------------+---------------         1 | Mabel Auction 1 | Mabel         2 | Mabel Auction 2 | Mabel
      3 | Mabel Auction 3 | Mabel         4 | Fred Auction 1  | Fred         5 | Fred Auction 2  | Fred
 


play=# select * from image;image_id | auction_id | image_descr
----------+------------+-------------       1 |          1 | image 1       2 |          1 | image 2       3 |
2| image 3       4 |          3 | image 4       5 |          3 | image 5       6 |          4 | image 7       7 |
  3 | image 8
 

So a basic JOIN gets this:

SELECT auction.auction_id, image.image_id, image.image_descr
FROM auction JOIN image ON auction.auction_id = image.auction_id
WHERE auction.auction_owner = 'Mabel';
auction_id | image_id | image_descr
------------+----------+-------------         1 |        1 | image 1         1 |        2 | image 2         2 |
3| image 3         3 |        4 | image 4         3 |        5 | image 5         3 |        7 | image 8
 
(6 rows)

Now the problem: I can't seem to remember how to get only the max value 
for the image_id for each auction_id so that the result set would be:
auction_id | image_id | image_descr
------------+----------+-------------         1 |        2 | image 2         2 |        3 | image 3         3 |
7| image 8
 

Playing with the max() aggregate seems to be the correct path, but for 
the life of me I can't seem to get the syntax to the point that it 
produces what I need. Any help would be greatly appreciated!

Thanks,

-Heflin




Re: Join issue on a maximum value

From
Bruno Wolff III
Date:
On Wed, Apr 21, 2004 at 14:29:34 -0400, Heflin <hhogan@tampabay.rr.com> wrote:
> 
> So a basic JOIN gets this:
> 
> SELECT auction.auction_id, image.image_id, image.image_descr
> FROM auction JOIN image ON auction.auction_id = image.auction_id
> WHERE auction.auction_owner = 'Mabel';
> 
> Now the problem: I can't seem to remember how to get only the max value 
> for the image_id for each auction_id so that the result set would be:

The postgres specific way of doing this is:
SELECT DISTINCT ON (auction.auction_id)   auction.auction_id, image.image_id, image.image_descr FROM auction JOIN image
ONauction.auction_id = image.auction_id WHERE auction.auction_owner = 'Mabel' ORDER BY auction.auction_id,
image.image_idDESC
 
;

The more standard way to do it would be joining auction and image
with a group by and max to get the highest image_id and then joining
that result to image again to get the corresponding description.


Re: Join issue on a maximum value

From
Edmund Bacon
Date:
Two choices that work:

Either add another JOIN in which retrieves the MAX(image_id) for each 
auction:

SELECT auction.auction_id, image.image_id, image.image_descr   FROM auction   JOIN image USING(auction_id)   JOIN (
SELECTauction_id, MAX(image_id) AS image_id            FROM image           GROUP BY auction_id) max_aid USING
(image_id)  WHERE owner = 'Mabel'   ORDER by auction.auction_id;
 

OR use a sub-select:

SELECT auction.auction_id, image_id, image.image_descr  FROM auction  JOIN image USING (auction_id) WHERE image_id = (
SELECTmax(image_id)                       FROM image                     WHERE auction_id = auction.auction_id)    AND
image_owner= 'Mabel';
 

Test both with your data - My experience is that the sub-select runs 
slower than throwing in the extra join.


Heflin wrote:

> OK, it's been a while since I've had to do anything remotely complex 
> in SQL, so this may just be a pure brain block on my part.
>
> I have 2 tables, auction and image, defined like this:
>
>                                   Table "public.auction"
>     Column      |  Type   |                            
> Modifiers                           
> -----------------+---------+----------------------------------------------------------------- 
>
> auction_id      | integer | not null default 
> nextval('public.auction_auction_id_seq'::text)
> auction_descrip | text    |
> auction_owner   | text    |
> Indexes:
>    "auction_pkey" primary key, btree (auction_id)
>
>
>                                Table "public.image"
>   Column    |  Type   |                          
> Modifiers                         
> -------------+---------+------------------------------------------------------------- 
>
> image_id    | integer | not null default 
> nextval('public.image_image_id_seq'::text)
> auction_id  | integer | not null
> image_descr | text    |
> Indexes:
>    "image_pkey" primary key, btree (image_id)
> Foreign-key constraints:
>    "$1" FOREIGN KEY (auction_id) REFERENCES auction(auction_id) ON 
> UPDATE RESTRICT ON DELETE RESTRICT
>
>
> Current data in the tables:
>
> play=# select * from auction
> play-# ;
> auction_id | auction_descrip | auction_owner
> ------------+-----------------+---------------
>          1 | Mabel Auction 1 | Mabel
>          2 | Mabel Auction 2 | Mabel
>          3 | Mabel Auction 3 | Mabel
>          4 | Fred Auction 1  | Fred
>          5 | Fred Auction 2  | Fred
>
>
> play=# select * from image;
> image_id | auction_id | image_descr
> ----------+------------+-------------
>        1 |          1 | image 1
>        2 |          1 | image 2
>        3 |          2 | image 3
>        4 |          3 | image 4
>        5 |          3 | image 5
>        6 |          4 | image 7
>        7 |          3 | image 8
>
> So a basic JOIN gets this:
>
> SELECT auction.auction_id, image.image_id, image.image_descr
> FROM auction JOIN image ON auction.auction_id = image.auction_id
> WHERE auction.auction_owner = 'Mabel';
>
> auction_id | image_id | image_descr
> ------------+----------+-------------
>          1 |        1 | image 1
>          1 |        2 | image 2
>          2 |        3 | image 3
>          3 |        4 | image 4
>          3 |        5 | image 5
>          3 |        7 | image 8
> (6 rows)
>
> Now the problem: I can't seem to remember how to get only the max 
> value for the image_id for each auction_id so that the result set 
> would be:
>
> auction_id | image_id | image_descr
> ------------+----------+-------------
>          1 |        2 | image 2
>          2 |        3 | image 3
>          3 |        7 | image 8
>
> Playing with the max() aggregate seems to be the correct path, but for 
> the life of me I can't seem to get the syntax to the point that it 
> produces what I need. Any help would be greatly appreciated!
>
> Thanks,
>
> -Heflin
>
>
>

-- 
Edmund Bacon <ebacon@onesystem.com>



Re: Join issue on a maximum value

From
Heflin
Date:
Bruno Wolff III wrote:

>On Wed, Apr 21, 2004 at 14:29:34 -0400,
>  Heflin <hhogan@tampabay.rr.com> wrote:
>  
>
>>So a basic JOIN gets this:
>>
>>SELECT auction.auction_id, image.image_id, image.image_descr
>>FROM auction JOIN image ON auction.auction_id = image.auction_id
>>WHERE auction.auction_owner = 'Mabel';
>>
>>Now the problem: I can't seem to remember how to get only the max value 
>>for the image_id for each auction_id so that the result set would be:
>>    
>>
>
>The postgres specific way of doing this is:
>SELECT DISTINCT ON (auction.auction_id)
>    auction.auction_id, image.image_id, image.image_descr
>  FROM auction JOIN image ON auction.auction_id = image.auction_id
>  WHERE auction.auction_owner = 'Mabel'
>  ORDER BY auction.auction_id, image.image_id DESC
>;
>
>The more standard way to do it would be joining auction and image
>with a group by and max to get the highest image_id and then joining
>that result to image again to get the corresponding description.
>
>  
>
Thanks!

I was actually trying to do it the more standard way, but I've been 
bungling up the syntax. I'm going to play with that some more, since it 
might be useful elsewhere.

The thing that disturbs me about your syntax is that I don't really see 
an assurance that I'll get the correct  image_id. Any chance you can 
tell me why this works?

Thanks again,

-Heflin



Re: Join issue on a maximum value

From
Bruno Wolff III
Date:
On Wed, Apr 21, 2004 at 16:28:10 -0400, Heflin <hhogan@tampabay.rr.com> wrote:
> >
> >The postgres specific way of doing this is:
> >SELECT DISTINCT ON (auction.auction_id)
> >   auction.auction_id, image.image_id, image.image_descr
> > FROM auction JOIN image ON auction.auction_id = image.auction_id
> > WHERE auction.auction_owner = 'Mabel'
> > ORDER BY auction.auction_id, image.image_id DESC
> >;
> 
> The thing that disturbs me about your syntax is that I don't really see 
> an assurance that I'll get the correct  image_id. Any chance you can 
> tell me why this works?

The postgres semantic is that when dinstinct on is combined with order by
the first distinct row in the order defined by the order by is the one
returned. This is described in the documentation and there is an example
of a query taking advantage of this.


Re: Join issue on a maximum value

From
Tom Lane
Date:
Heflin <hhogan@tampabay.rr.com> writes:
> Bruno Wolff III wrote:
>> The postgres specific way of doing this is:
>> SELECT DISTINCT ON (auction.auction_id)
>> auction.auction_id, image.image_id, image.image_descr
>> FROM auction JOIN image ON auction.auction_id = image.auction_id
>> WHERE auction.auction_owner = 'Mabel'
>> ORDER BY auction.auction_id, image.image_id DESC

> The thing that disturbs me about your syntax is that I don't really see 
> an assurance that I'll get the correct  image_id. Any chance you can 
> tell me why this works?

The ORDER BY DESC is what forces the max image_id to be selected.  Read
the discussion of SELECT DISTINCT ON in the SELECT reference page; the
"weather report" example may be illuminating.
        regards, tom lane


Re: Join issue on a maximum value

From
"Atesz"
Date:
On Wed, Apr 21, 2004 at 14:29:34 -0400, Heflin <hhogan ( at ) tampabay ( dot ) rr ( dot ) com> wrote:
> 
> SELECT auction.auction_id, image.image_id, image.image_descr FROM 
> auction JOIN image ON auction.auction_id = image.auction_id WHERE 
> auction.auction_owner = 'Mabel';
> 

In my opinion there are 2 problem: how can you make the query and how
many rows is in the result (performace)? Usually when you have more rows
in the result you can use the LIMIT and OFFSET. So you can reach the
result to unfold more pages. So I bult in these LIMIT and OFFSET into
the queries.

2 new possibilities:
 SELECT IDSEL.*, (SELECT image_descr FROM image WHERE
IDSEL.image_id=image.image_id) FROM (   SELECT auction.auction_id, max(image.image_id) AS image_id  FROM
auction JOIN image USING (auction_id)   WHERE auction_owner = 'Mabel'   GROUP BY auction.auction_id   ORDER BY
auction.auction_id  LIMIT 10 OFFSET 0 ) AS IDSEL;
 

or
 SELECT DISTINCT ON (image.auction_id) image.auction_id,
image.image_id, image.image_descr  FROM auction    JOIN image USING (auction_id)   WHERE auction.auction_owner =
'Mabel'  ORDER BY image.auction_id, (-image.image_id) LIMIT 10 OFFSET 0;
 

Index suggestions:  CREATE INDEX auction_auction_owner on auction(auction_owner); CREATE INDEX image_auction_id on
image(auction_id);CREATE INDEX image_auction_id_neg_image_id on image(auction_id,
 
(-image_id));  -- Specially for the second solution

The second solution build on Bruno Wolff III's ideas:
http://archives.postgresql.org/pgsql-sql/2004-04/msg00211.php and
http://archives.postgresql.org/pgsql-sql/2004-04/msg00262.php .

You can see more solutions for your problem. You have  to select the
best performance solution for your specific databse. Use the EXPLAIN!

Regards, Antal Attila





Re: Join issue on a maximum value

From
Jeremy Semeiks
Date:
On Wed, Apr 21, 2004 at 02:29:34PM -0400, Heflin wrote:
> OK, it's been a while since I've had to do anything remotely complex in 
> SQL, so this may just be a pure brain block on my part.
> 
...
>
> So a basic JOIN gets this:
> 
> SELECT auction.auction_id, image.image_id, image.image_descr
> FROM auction JOIN image ON auction.auction_id = image.auction_id
> WHERE auction.auction_owner = 'Mabel';
> 
> auction_id | image_id | image_descr
> ------------+----------+-------------
>          1 |        1 | image 1
>          1 |        2 | image 2
>          2 |        3 | image 3
>          3 |        4 | image 4
>          3 |        5 | image 5
>          3 |        7 | image 8
> (6 rows)
> 
> Now the problem: I can't seem to remember how to get only the max value 
> for the image_id for each auction_id so that the result set would be:
> 
> auction_id | image_id | image_descr
> ------------+----------+-------------
>          1 |        2 | image 2
>          2 |        3 | image 3
>          3 |        7 | image 8
> 
> Playing with the max() aggregate seems to be the correct path, but for 
> the life of me I can't seem to get the syntax to the point that it 
> produces what I need. Any help would be greatly appreciated!

A simple way to write this is to use DISTINCT ON combined with ORDER
BY. In this case, these clauses can substitute for the MAX aggregate:

select distinct on (auction.auction_id) auction.auction_id, image.image_id,image.image_descr
from auction join image using (auction_id)
where auction.auction_owner = 'Mabel'
order by auction.auction_id, image.image_id desc;

(I haven't tested this.)

I think you need a subselect in there if you want to use the MAX
aggregate.

- Jeremy