Re: people who buy A, also buy C, D, E - Mailing list pgsql-sql
From | Jan Wieck |
---|---|
Subject | Re: people who buy A, also buy C, D, E |
Date | |
Msg-id | 42BD6395.8070301@Yahoo.com Whole thread Raw |
In response to | Re: people who buy A, also buy C, D, E ("Greg Sabino Mullane" <greg@turnstep.com>) |
Responses |
Re: people who buy A, also buy C, D, E
|
List | pgsql-sql |
On 6/24/2005 11:35 PM, Greg Sabino Mullane wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > >> The goal of my query is: given a book, what did other people who >> bought this book also buy? I plan the list the 5 most popular such >> books. This sounds very much like one of the congestion problems given in the TPC-W benchmark. You might want to take a look at some of the published full disclosure reports or the PHP TPC-W implementation on pgfoundry to get some hints. Jan > > I've been playing with this a little bit, and I don't think you are > going to get better than you already have. Certainly, the caching > won't work either as any insert into the watch_list_element has > the potential to change a very large number of pre-compiled lists. > However, there are some minor optimizations that can be made to > speed up the existing query quite a bit. One note first: the LIMIT > should be 6 not 5 if you really want the five other "books" and the > book itself will more than likely appear in the list. Picking it > out is something the client app can do. > > * Make sure the tables are freshly analyzed. Might want to bump > up the default stats a bit too. > > * Looks like you already have indexes on the watch_list_element > table. The watch_list_element_element_id index could be broken > into multiple conditional indexes, but your explain shows this > would not really gain us much: > > actual time=37.957..41.789 > > * One big gain would be to cluster the table on watch_list_id: > > CREATE INDEX watch_index ON watch_list_element (watch_list_id); > CLUSTER watch_index ON watch_list_element; > > I got about a 25% speedup on my queries by doing this. YMMV, as I > don't know enough about your conditions to do more than make an > approximate test database. But it should help this query out. > > * Finally, you should upgrade if at all possible. Going from > 7.4.7 to 8.0.1 gave me a 10% speed increase, while going from > 8.0.1 to 8.1.0 (e.g. the upcoming version) gave me an additional > 25% speed boost, mostly due to the new bitmap stuff. So, making > the jump to 8.0.1 will be good practice for the 8.1.0 jump, right? :) > > Overall, I was able to get the query to go about a third faster > than when I started. Hope this helps. > > - -- > Greg Sabino Mullane greg@turnstep.com > PGP Key: 0x14964AC8 200506242328 > http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 > > -----BEGIN PGP SIGNATURE----- > > iD8DBQFCvNCrvJuQZxSWSsgRAmkDAJ44z/Ei27HuEBqx/htmCRHJZWi8VQCfV2mm > upeE0p3z4h11NJzl5aOqCkc= > =LVqI > -----END PGP SIGNATURE----- > > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #