The goal of my query is: given a book, what did other people who
bought this book also buy? I plan the list the 5 most popular such
books. In reality, this isn't about books, but that makes it easier
to understand I think.
We have a table of customer_id (watch_list_id) and book_id
(element_id).
freshports.org=# \d watch_list_element Table "public.watch_list_element" Column | Type | Modifiers
---------------+---------+-----------watch_list_id | integer | not nullelement_id | integer | not null
Indexes: "watch_list_element_pkey" primary key, btree (watch_list_id,
element_id) "watch_list_element_element_id" btree (element_id)
Foreign-key constraints: "$2" FOREIGN KEY (watch_list_id) REFERENCES watch_list(id) ON
UPDATE CASCADE ON DELETE CASCADE "$1" FOREIGN KEY (element_id) REFERENCES element(id) ON UPDATE
CASCADE ON DELETE CASCADE
freshports.org=#
I have a query which returns the needed results:
SELECT W.element_id FROM watch_list_element W WHERE w.watch_list_id in (select watch_list_id from
watch_list_element where element_id = 54968) GROUP BY W.element_id ORDER BY count(W.watch_list_id) DESC LIMIT 5;
But performance is an issue here. So I'm planning to calculate all
the possible values and cache them. That is, given each element_id in
a watch_list, what are the top 5 element_id values on all the lists
on which the original element_id appears?
I'm having trouble constructing the query. I'm not even sure I can
do this in one select, but that would be nice. Examples and clues
are appreciated.
Any ideas?
Thank you.
--
Dan Langille : http://www.langille.org/
BSDCan - The Technical BSD Conference - http://www.bsdcan.org/ NEW brochure available at
http://www.bsdcan.org/2005/advocacy/