Thread: Join issue on a maximum value
OK, it's been a while since I've had to do anything remotely complex in SQL, so this may just be a pure brain block on my part. I have 2 tables, auction and image, defined like this: Table "public.auction" Column | Type | Modifiers -----------------+---------+-----------------------------------------------------------------auction_id | integer |not null default nextval('public.auction_auction_id_seq'::text)auction_descrip | text |auction_owner | text | Indexes: "auction_pkey" primary key, btree (auction_id) Table "public.image" Column | Type | Modifiers -------------+---------+-------------------------------------------------------------image_id | integer | not null default nextval('public.image_image_id_seq'::text)auction_id | integer | not nullimage_descr | text | Indexes: "image_pkey" primary key, btree (image_id) Foreign-key constraints: "$1" FOREIGN KEY (auction_id) REFERENCES auction(auction_id) ON UPDATE RESTRICT ON DELETE RESTRICT Current data in the tables: play=# select * from auction play-# ;auction_id | auction_descrip | auction_owner ------------+-----------------+--------------- 1 | Mabel Auction 1 | Mabel 2 | Mabel Auction 2 | Mabel 3 | Mabel Auction 3 | Mabel 4 | Fred Auction 1 | Fred 5 | Fred Auction 2 | Fred play=# select * from image;image_id | auction_id | image_descr ----------+------------+------------- 1 | 1 | image 1 2 | 1 | image 2 3 | 2| image 3 4 | 3 | image 4 5 | 3 | image 5 6 | 4 | image 7 7 | 3 | image 8 So a basic JOIN gets this: SELECT auction.auction_id, image.image_id, image.image_descr FROM auction JOIN image ON auction.auction_id = image.auction_id WHERE auction.auction_owner = 'Mabel'; auction_id | image_id | image_descr ------------+----------+------------- 1 | 1 | image 1 1 | 2 | image 2 2 | 3| image 3 3 | 4 | image 4 3 | 5 | image 5 3 | 7 | image 8 (6 rows) Now the problem: I can't seem to remember how to get only the max value for the image_id for each auction_id so that the result set would be: auction_id | image_id | image_descr ------------+----------+------------- 1 | 2 | image 2 2 | 3 | image 3 3 | 7| image 8 Playing with the max() aggregate seems to be the correct path, but for the life of me I can't seem to get the syntax to the point that it produces what I need. Any help would be greatly appreciated! Thanks, -Heflin
On Wed, Apr 21, 2004 at 14:29:34 -0400, Heflin <hhogan@tampabay.rr.com> wrote: > > So a basic JOIN gets this: > > SELECT auction.auction_id, image.image_id, image.image_descr > FROM auction JOIN image ON auction.auction_id = image.auction_id > WHERE auction.auction_owner = 'Mabel'; > > Now the problem: I can't seem to remember how to get only the max value > for the image_id for each auction_id so that the result set would be: The postgres specific way of doing this is: SELECT DISTINCT ON (auction.auction_id) auction.auction_id, image.image_id, image.image_descr FROM auction JOIN image ONauction.auction_id = image.auction_id WHERE auction.auction_owner = 'Mabel' ORDER BY auction.auction_id, image.image_idDESC ; The more standard way to do it would be joining auction and image with a group by and max to get the highest image_id and then joining that result to image again to get the corresponding description.
Two choices that work: Either add another JOIN in which retrieves the MAX(image_id) for each auction: SELECT auction.auction_id, image.image_id, image.image_descr FROM auction JOIN image USING(auction_id) JOIN ( SELECTauction_id, MAX(image_id) AS image_id FROM image GROUP BY auction_id) max_aid USING (image_id) WHERE owner = 'Mabel' ORDER by auction.auction_id; OR use a sub-select: SELECT auction.auction_id, image_id, image.image_descr FROM auction JOIN image USING (auction_id) WHERE image_id = ( SELECTmax(image_id) FROM image WHERE auction_id = auction.auction_id) AND image_owner= 'Mabel'; Test both with your data - My experience is that the sub-select runs slower than throwing in the extra join. Heflin wrote: > OK, it's been a while since I've had to do anything remotely complex > in SQL, so this may just be a pure brain block on my part. > > I have 2 tables, auction and image, defined like this: > > Table "public.auction" > Column | Type | > Modifiers > -----------------+---------+----------------------------------------------------------------- > > auction_id | integer | not null default > nextval('public.auction_auction_id_seq'::text) > auction_descrip | text | > auction_owner | text | > Indexes: > "auction_pkey" primary key, btree (auction_id) > > > Table "public.image" > Column | Type | > Modifiers > -------------+---------+------------------------------------------------------------- > > image_id | integer | not null default > nextval('public.image_image_id_seq'::text) > auction_id | integer | not null > image_descr | text | > Indexes: > "image_pkey" primary key, btree (image_id) > Foreign-key constraints: > "$1" FOREIGN KEY (auction_id) REFERENCES auction(auction_id) ON > UPDATE RESTRICT ON DELETE RESTRICT > > > Current data in the tables: > > play=# select * from auction > play-# ; > auction_id | auction_descrip | auction_owner > ------------+-----------------+--------------- > 1 | Mabel Auction 1 | Mabel > 2 | Mabel Auction 2 | Mabel > 3 | Mabel Auction 3 | Mabel > 4 | Fred Auction 1 | Fred > 5 | Fred Auction 2 | Fred > > > play=# select * from image; > image_id | auction_id | image_descr > ----------+------------+------------- > 1 | 1 | image 1 > 2 | 1 | image 2 > 3 | 2 | image 3 > 4 | 3 | image 4 > 5 | 3 | image 5 > 6 | 4 | image 7 > 7 | 3 | image 8 > > So a basic JOIN gets this: > > SELECT auction.auction_id, image.image_id, image.image_descr > FROM auction JOIN image ON auction.auction_id = image.auction_id > WHERE auction.auction_owner = 'Mabel'; > > auction_id | image_id | image_descr > ------------+----------+------------- > 1 | 1 | image 1 > 1 | 2 | image 2 > 2 | 3 | image 3 > 3 | 4 | image 4 > 3 | 5 | image 5 > 3 | 7 | image 8 > (6 rows) > > Now the problem: I can't seem to remember how to get only the max > value for the image_id for each auction_id so that the result set > would be: > > auction_id | image_id | image_descr > ------------+----------+------------- > 1 | 2 | image 2 > 2 | 3 | image 3 > 3 | 7 | image 8 > > Playing with the max() aggregate seems to be the correct path, but for > the life of me I can't seem to get the syntax to the point that it > produces what I need. Any help would be greatly appreciated! > > Thanks, > > -Heflin > > > -- Edmund Bacon <ebacon@onesystem.com>
Bruno Wolff III wrote: >On Wed, Apr 21, 2004 at 14:29:34 -0400, > Heflin <hhogan@tampabay.rr.com> wrote: > > >>So a basic JOIN gets this: >> >>SELECT auction.auction_id, image.image_id, image.image_descr >>FROM auction JOIN image ON auction.auction_id = image.auction_id >>WHERE auction.auction_owner = 'Mabel'; >> >>Now the problem: I can't seem to remember how to get only the max value >>for the image_id for each auction_id so that the result set would be: >> >> > >The postgres specific way of doing this is: >SELECT DISTINCT ON (auction.auction_id) > auction.auction_id, image.image_id, image.image_descr > FROM auction JOIN image ON auction.auction_id = image.auction_id > WHERE auction.auction_owner = 'Mabel' > ORDER BY auction.auction_id, image.image_id DESC >; > >The more standard way to do it would be joining auction and image >with a group by and max to get the highest image_id and then joining >that result to image again to get the corresponding description. > > > Thanks! I was actually trying to do it the more standard way, but I've been bungling up the syntax. I'm going to play with that some more, since it might be useful elsewhere. The thing that disturbs me about your syntax is that I don't really see an assurance that I'll get the correct image_id. Any chance you can tell me why this works? Thanks again, -Heflin
On Wed, Apr 21, 2004 at 16:28:10 -0400, Heflin <hhogan@tampabay.rr.com> wrote: > > > >The postgres specific way of doing this is: > >SELECT DISTINCT ON (auction.auction_id) > > auction.auction_id, image.image_id, image.image_descr > > FROM auction JOIN image ON auction.auction_id = image.auction_id > > WHERE auction.auction_owner = 'Mabel' > > ORDER BY auction.auction_id, image.image_id DESC > >; > > The thing that disturbs me about your syntax is that I don't really see > an assurance that I'll get the correct image_id. Any chance you can > tell me why this works? The postgres semantic is that when dinstinct on is combined with order by the first distinct row in the order defined by the order by is the one returned. This is described in the documentation and there is an example of a query taking advantage of this.
Heflin <hhogan@tampabay.rr.com> writes: > Bruno Wolff III wrote: >> The postgres specific way of doing this is: >> SELECT DISTINCT ON (auction.auction_id) >> auction.auction_id, image.image_id, image.image_descr >> FROM auction JOIN image ON auction.auction_id = image.auction_id >> WHERE auction.auction_owner = 'Mabel' >> ORDER BY auction.auction_id, image.image_id DESC > The thing that disturbs me about your syntax is that I don't really see > an assurance that I'll get the correct image_id. Any chance you can > tell me why this works? The ORDER BY DESC is what forces the max image_id to be selected. Read the discussion of SELECT DISTINCT ON in the SELECT reference page; the "weather report" example may be illuminating. regards, tom lane
On Wed, Apr 21, 2004 at 14:29:34 -0400, Heflin <hhogan ( at ) tampabay ( dot ) rr ( dot ) com> wrote: > > SELECT auction.auction_id, image.image_id, image.image_descr FROM > auction JOIN image ON auction.auction_id = image.auction_id WHERE > auction.auction_owner = 'Mabel'; > In my opinion there are 2 problem: how can you make the query and how many rows is in the result (performace)? Usually when you have more rows in the result you can use the LIMIT and OFFSET. So you can reach the result to unfold more pages. So I bult in these LIMIT and OFFSET into the queries. 2 new possibilities: SELECT IDSEL.*, (SELECT image_descr FROM image WHERE IDSEL.image_id=image.image_id) FROM ( SELECT auction.auction_id, max(image.image_id) AS image_id FROM auction JOIN image USING (auction_id) WHERE auction_owner = 'Mabel' GROUP BY auction.auction_id ORDER BY auction.auction_id LIMIT 10 OFFSET 0 ) AS IDSEL; or SELECT DISTINCT ON (image.auction_id) image.auction_id, image.image_id, image.image_descr FROM auction JOIN image USING (auction_id) WHERE auction.auction_owner = 'Mabel' ORDER BY image.auction_id, (-image.image_id) LIMIT 10 OFFSET 0; Index suggestions: CREATE INDEX auction_auction_owner on auction(auction_owner); CREATE INDEX image_auction_id on image(auction_id);CREATE INDEX image_auction_id_neg_image_id on image(auction_id, (-image_id)); -- Specially for the second solution The second solution build on Bruno Wolff III's ideas: http://archives.postgresql.org/pgsql-sql/2004-04/msg00211.php and http://archives.postgresql.org/pgsql-sql/2004-04/msg00262.php . You can see more solutions for your problem. You have to select the best performance solution for your specific databse. Use the EXPLAIN! Regards, Antal Attila
On Wed, Apr 21, 2004 at 02:29:34PM -0400, Heflin wrote: > OK, it's been a while since I've had to do anything remotely complex in > SQL, so this may just be a pure brain block on my part. > ... > > So a basic JOIN gets this: > > SELECT auction.auction_id, image.image_id, image.image_descr > FROM auction JOIN image ON auction.auction_id = image.auction_id > WHERE auction.auction_owner = 'Mabel'; > > auction_id | image_id | image_descr > ------------+----------+------------- > 1 | 1 | image 1 > 1 | 2 | image 2 > 2 | 3 | image 3 > 3 | 4 | image 4 > 3 | 5 | image 5 > 3 | 7 | image 8 > (6 rows) > > Now the problem: I can't seem to remember how to get only the max value > for the image_id for each auction_id so that the result set would be: > > auction_id | image_id | image_descr > ------------+----------+------------- > 1 | 2 | image 2 > 2 | 3 | image 3 > 3 | 7 | image 8 > > Playing with the max() aggregate seems to be the correct path, but for > the life of me I can't seem to get the syntax to the point that it > produces what I need. Any help would be greatly appreciated! A simple way to write this is to use DISTINCT ON combined with ORDER BY. In this case, these clauses can substitute for the MAX aggregate: select distinct on (auction.auction_id) auction.auction_id, image.image_id,image.image_descr from auction join image using (auction_id) where auction.auction_owner = 'Mabel' order by auction.auction_id, image.image_id desc; (I haven't tested this.) I think you need a subselect in there if you want to use the MAX aggregate. - Jeremy