Re: VACUUM FULL - Mailing list pgsql-admin
From | Rajesh Kumar Mallah. |
---|---|
Subject | Re: VACUUM FULL |
Date | |
Msg-id | 200205121054.06218.mallah@trade-india.com Whole thread Raw |
In response to | Re: VACUUM FULL (Brian McCane <bmccane@mccons.net>) |
List | pgsql-admin |
Hi Brian, Yes contrib/tsearch does not provide relevence ranking at the moment, But OpenFTS-perl/tcl (openfts.sf.net) (from which tsearch is derieved) do provide relevence ranking and those are also very fast. and they work in similar fashion as you have metioned in your reply, also OpenFTS does stemming in you search phrases. but i understand applicability of those solutions are also subject to your current requirements. regds mallah. On Saturday 11 May 2002 10:23 pm, Brian McCane wrote: > I have tried contrib/tsearch, but I do fairly complex score computation > based on word placement (headers,body,graphics,metatags, etc). And also > do additional computations on the scores when the user does a search. It > is very quick to use something like tsearch to figure out which pages > match my criterion, but very slow to then compute a valid score for each > page. Instead, in an attempt to speed things up, a score is computed for > each distinct word on a page, and that score is stored in this table. > Then when a search is requested, I use previously existing search > information and data from this table to compute the new results. > > The upshot of all this computing is that I get fairly relevant results, > but I have to do a lot of math to get there. I have been considering > combining the two methods, doing something like a full text search to find > pages that meet my criterion, and then using this table to actually > compute a pages score. > > - brian > > On Sat, 11 May 2002, Rajesh Kumar Mallah. wrote: > > Hi Brian , > > > > are you performing full text search in any case? > > > > Apart from optimizing the TABLE/INDEXES (thru VACUUM measures) > > > > i feel using text indexes provided by contrib/tsearch can also > > lead to significant improvement in search performance. > > > > regds > > mallah. > > > > On Saturday 11 May 2002 05:48 am, Brian McCane wrote: > > > Okay, I guess I misunderstood something about "VACUUM FULL". My > > > understanding was that a VACUUM (without FULL), marked unused records > > > for reuse. VACUUM FULL moved records from the end of a file to the > > > holes where unused records were stored and truncated the file to free > > > disk space. So far so good, but.... > > > > > > I have had continued loss of performance on one of my largest tables > > > (600,000,000 records). There are 4 fields in a record like such: > > > > > > CREATE TABLE foo ( > > > a int not null references bar(a) on delete cascade on update no > > > action, b int not null references baz(b) on delete cascade on update no > > > action, c int, > > > d smallint, > > > primary key(a, b)) ; > > > CREATE INDEX foo_ac ON foo (a,c) ; > > > CREATE INDEX foo_ad on foo (a,d) ; > > > And there are 3 triggers which fire before insert/delete/update. > > > > > > I INSERT/UPDATE/DELETE approximately 300,000 records per day, but this > > > number is increasing on a daily basis as I make changes which improve > > > the performance of my data gathering tools (spiders ;). Two days ago, > > > it had reached the point where a search for a 3-word term (ie. free > > > news servers) took about 60 seconds. I have just spent 48 hours > > > running a VACUUM FULL on my table, and now the same search takes < 10 > > > seconds. I assume that the increase in performance is due to the > > > decrease in table/index size which added up to approximate 1GB of freed > > > space on the machine, which was approximately 4% of the original size > > > of the table and all its indices. But, a 4% decrease in size should not > > > add up to a 84% increase in performance (is that right? I always get > > > the ratio confused :). > > > > > > If all that VACUUM FULL did was move records from file 12345678.6 to > > > file 12345678, the database would still being doing a large number of > > > random accesses on the table. However, if VACUUM FULL clusters the > > > data according to the primary key, it would still be doing a large > > > number of random access on the table, because the primary key has > > > almost nothing to do with how I actually access the data in real life. > > > So, is VACUUM FULL looking somewhere in pg_statistics (or > > > pg_stat_user_indexes), to determine which index I actually use most > > > (foo_ad), and then clustering the data that way, or is there some other > > > agent at work here. > > > > > > - brian > > > > > > > > > Wm. Brian McCane | Life is full of doors that won't > > > open Search http://recall.maxbaud.net/ | when you knock, equally > > > spaced amid those Usenet http://freenews.maxbaud.net/ | that open when > > > you don't want them to. Auction http://www.sellit-here.com/ | - Roger > > > Zelazny "Blood of Amber" > > > > > > Rajesh Kumar Mallah, Project Manager (Development) Infocom Network Limited, New Delhi phone: +91(11)6152172 (221) (L) ,9811255597 (M) Visit http://www.trade-india.com , India's Leading B2B eMarketplace.
pgsql-admin by date: