Thread: Having difficulty writing a "best-fit" query..
Hi list,<br /><br />I have a many-to-many relationship between movies and genres and, in the link table I have a third fieldcalled which orders the "appropriateness" of the relations within each movie. <br /><br />For example: <br /><br />movie_id,genre_id, relevance (i've used movie/genre titles for clarity here, but in reality they're id's)<br />--------------------------------------------<br/>beverly hills cop, action, 2<br />beverly hills cop, comedy, 1 <br />theheartbreak kid, comedy, 2<br />the heartbreak kid, romance, 1<br /><br />The above would mean, to my application:<br/>"Beverly Hills Cop is both an Action and a Comedy movie but primarily an Action movie."<br /> "The HeartbreakKid is both a Comedy and a Romance movie but primarily a Comedy movie."<br /><br />First of all, if there's a betterway to model this kind of ranking/ordering of many-to-many relationships, please let me know.<br /><br />Now, to myproblem..<br /><br />I'm taking a subset of all my genres, and I want to get ONE row for each movie in the subset alongsideits most appropriate genre (whichever has the highest relevance). In other words, the best fit. <br /><br />I'vehad a few goes at writing this query but I can't seem to get it right.. The theory in my mind is to:<br /><br />1.filter the link table down to rows that fit the subset of categories (easy)<br /><br />2. filter the link table furtherto keep only the max(relevance) for each movie_id .. this is where i'm having trouble. <br /><br />If someone canshed some light on this for me, I'd really appreciate it.<br /><br />Thanks for your time,<br />Jamie<br />
On 10/16/07, Jamie Tufnell <diesql@googlemail.com> wrote: > I'm taking a subset of all my genres, and I want to get ONE row for each > movie in the subset alongside its most appropriate genre (whichever has the > highest relevance). In other words, the best fit. You didn't provide the expected output, but try: SELECT * FROM t ttWHERE relevance = (SELECT MAX(relevance) FROM t WHERE movie_id =tt.movie_id)
--- Jamie Tufnell <diesql@googlemail.com> wrote: > movie_id, genre_id, relevance (i've used movie/genre titles for clarity > here, but in reality they're id's) > -------------------------------------------- > beverly hills cop, action, 2 > beverly hills cop, comedy, 1 > the heartbreak kid, comedy, 2 > the heartbreak kid, romance, 1 > First of all, if there's a better way to model this kind of ranking/ordering > of many-to-many relationships, please let me know. Joe Celko had an example like this in his book: 17: EMPLOYMENT AGENCY PUZZLE http://www.elsevier.com/wps/find/bookdescription.cws_home/710075/description#description the only difference was that he was modeling employees and skillsets. IIRC, the terminology for the improved model was the "full disjuctive" model. Regards, Richard Broersma Jr.
On Oct 16, 2007, at 12:14 , Richard Broersma Jr wrote: > the only difference was that he was modeling employees and > skillsets. IIRC, the terminology for > the improved model was the "full disjuctive" model. Off chance, might the full disjunction work be relevant here? http://pgfoundry.org/projects/fulldisjunction/ Michael Glaesemann grzm seespotcode net
In article <b0a4f3350710160946y4a31a227sbffbf085cdbfa487@mail.gmail.com>, "Jamie Tufnell" <diesql@googlemail.com> writes: > Hi list, > I have a many-to-many relationship between movies and genres and, in the link > table I have a third field called which orders the "appropriateness" of the > relations within each movie. > For example: > movie_id, genre_id, relevance (i've used movie/genre titles for clarity here, > but in reality they're id's) > -------------------------------------------- > beverly hills cop, action, 2 > beverly hills cop, comedy, 1 > the heartbreak kid, comedy, 2 > the heartbreak kid, romance, 1 > The above would mean, to my application: > "Beverly Hills Cop is both an Action and a Comedy movie but primarily an Action > movie." > "The Heartbreak Kid is both a Comedy and a Romance movie but primarily a Comedy > movie." > First of all, if there's a better way to model this kind of ranking/ordering of > many-to-many relationships, please let me know. This looks fine to me. > Now, to my problem.. > I'm taking a subset of all my genres, and I want to get ONE row for each movie > in the subset alongside its most appropriate genre (whichever has the highest > relevance). In other words, the best fit. You could use something like that: SELECT m.name, g.name, mg.relevance FROM movies m JOIN mg ON mg.movie_id = m.id JOIN genres g ON g.id = mg.genre_id LEFTJOIN mg mg1 ON mg1.movie_id = mg.movie_id AND mg1.relevance > mg.relevance WHERE mg1.movie_id IS NULL This means that there must not be an link table entry for the same movie with a higher relevance.