Re: PostgreSQL caching - Mailing list pgsql-performance
From | Vitaly Belman |
---|---|
Subject | Re: PostgreSQL caching |
Date | |
Msg-id | 1371146711781.20040525225305@012.net.il Whole thread Raw |
In response to | Re: PostgreSQL caching (Jochem van Dieten <jochemd@oli.tudelft.nl>) |
Responses |
Re: PostgreSQL caching
|
List | pgsql-performance |
Hello Jochem and Marty, I guess I should have posted the table structure before =(: Table structure + Indexes ------------------------- CREATE TABLE public.bv_books ( book_id serial NOT NULL, book_title varchar(255) NOT NULL, series_id int4, series_index int2, annotation_desc_id int4, description_desc_id int4, book_picture varchar(255) NOT NULL, vote_avg float4 NOT NULL, vote_count int4 NOT NULL, CONSTRAINT bv_books_pkey PRIMARY KEY (book_id) ) WITH OIDS; CREATE INDEX i_books_vote_avg ON public.bv_books USING btree (vote_avg); CREATE INDEX i_books_vote_count ON public.bv_books USING btree (vote_count); ------------------------- CREATE TABLE public.bv_bookgenres ( book_id int4 NOT NULL, genre_id int4 NOT NULL, CONSTRAINT bv_bookgenres_pkey PRIMARY KEY (book_id, genre_id), CONSTRAINT fk_bookgenres_book_id FOREIGN KEY (book_id) REFERENCES public.bv_books (book_id) ON UPDATE RESTRICT ON DELETERESTRICT ) WITH OIDS; CREATE INDEX i_bookgenres_book_id ON public.bv_bookgenres USING btree (book_id); CREATE INDEX i_bookgenres_genre_id ON public.bv_bookgenres USING btree (genre_id); ------------------------- MS> I didn't see the table structure, but I assume that the vote_avg and MS> vote_count fields are in bv_bookgenres. If no fields are actually MS> needed from bv_bookgenres, then the query might be constructed in a way MS> that only the index would be read, without loading any row data. I didn't understand you. vote_avg is stored in bv_books.. So yes, the only thing I need from bv_bookgenres is the id of the book, but I can't store this info in bv_books because there is N to N relationship between them - every book can belong to a number of genres... If that's what you meant. MS> I think that you mentioned this was for a web app. Do you actually have MS> a web page that displays 2000 rows of data? Well.. It is all "paginated", you can access 2000 items of the data (as there are actually 2000 books in the genre) but you only see 10 items at a time.. I mean, probably no one would go over the 2000 books, but I can't just hide them =\. JvD> Presuming that vote_avg is a field in the table bv_bookgenres, JvD> try a composite index on genre_id and vote_avg and then see if JvD> you can use the limit clause to reduce the number of loop JvD> iterations from 1993 to 10. I'm afraid your idea is invalid in my case =\... Naturally I could eventually do data coupling to gain perforemnce boost if this issue will not be solved in other ways. I'll keep your idea in mind anyway, thanks. Once again thanks for you feedback. Regards, Vitaly Belman ICQ: 1912453 AIM: VitalyB1984 MSN: tmdagent@hotmail.com Yahoo!: VitalyBe Tuesday, May 25, 2004, 6:37:44 PM, you wrote: JvD> Vitaly Belman wrote: >> >> If you'll be so kind though, I'd be glad if you could spot anything to >> speed up in this query. Here's the query and its plan that happens >> without any caching: >> >> ------------------------------------------------------------------------------------------------------------- >> QUERY >> ----- >> SELECT bv_books. * , >> vote_avg, >> vote_count >> FROM bv_bookgenres, >> bv_books >> WHERE bv_books.book_id = bv_bookgenres.book_id AND >> bv_bookgenres.genre_id = 5830 >> ORDER BY vote_avg DESC LIMIT 10 OFFSET 0; >> >> QUERY PLAN >> ---------- >> Limit (cost=2337.41..2337.43 rows=10 width=76) (actual >> time=7875.000..7875.000 rows=10 loops=1) >> -> Sort (cost=2337.41..2337.94 rows=214 width=76) (actual >> time=7875.000..7875.000 rows=10 loops=1) >> Sort Key: bv_books.vote_avg >> -> Nested Loop (cost=0.00..2329.13 rows=214 width=76) >> (actual time=16.000..7844.000 rows=1993 loops=1) >> -> Index Scan using i_bookgenres_genre_id on >> bv_bookgenres (cost=0.00..1681.54 rows=214 width=4) (actual >> time=16.000..3585.000 rows=1993 loops=1) >> Index Cond: (genre_id = 5830) >> -> Index Scan using bv_books_pkey on bv_books >> (cost=0.00..3.01 rows=1 width=76) (actual time=2.137..2.137 rows=1 >> loops=1993) >> Index Cond: (bv_books.book_id = "outer".book_id) >> Total runtime: 7875.000 ms JvD> Presuming that vote_avg is a field in the table bv_bookgenres, JvD> try a composite index on genre_id and vote_avg and then see if JvD> you can use the limit clause to reduce the number of loop JvD> iterations from 1993 to 10. JvD> CREATE INDEX test_idx ON bv_bookgenres (genre_id, vote_avg); JvD> The following query tries to force that execution lan and, JvD> presuming there is a foreign key relation between JvD> bv_books.book_id AND bv_bookgenres.book_id, I expect it will give JvD> the same results, but be carefull with NULL's: JvD> SELECT bv_books. * , JvD> vote_avg, JvD> vote_count JvD> FROM ( JvD> SELECT bg.* JvD> FROM bv_bookgenres bg JvD> WHERE bg.genre_id = 5830 JvD> ORDER BY JvD> bg.vote_avg DESC JvD> LIMIT 10 JvD> ) bv_bookgenres, JvD> bv_books JvD> WHERE bv_books.book_id = bv_bookgenres.book_id JvD> ORDER BY JvD> vote_avg DESC JvD> LIMIT 10; JvD> Jochem
pgsql-performance by date: