Thread: Improving Full text performance
Hi,

I'm using PHP and full text search on PostgreSQL 8.3 to index HTML descriptions. I have no access to the PostgreSQL server, since I use a shared hosting service. To improve search quality and performance, I want to do the following:

Strip all HTML tags, then use my PHP script to remove additional stop words (because I can't edit the stop words file on the server).

My question: does what I'm planning to do have any side effects? Any suggestions?

Thanks!
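The preprocessing step described above (strip the tags, then drop extra stop words before the text reaches PostgreSQL) could be sketched roughly as follows. This is only an illustration in Python, not the poster's PHP script; the names `TextExtractor`, `clean_for_indexing`, and the contents of `EXTRA_STOP_WORDS` are hypothetical, and the sketch assumes the original HTML is kept in a separate column for display.

```python
from html.parser import HTMLParser

# Hypothetical extra stop words; the real list depends on your own data.
EXTRA_STOP_WORDS = {"font", "nbsp", "click", "here"}


class TextExtractor(HTMLParser):
    """Collects text nodes and discards tags, scripts, and styles."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def clean_for_indexing(html, stop_words=EXTRA_STOP_WORDS):
    """Return plain text with tags removed and extra stop words dropped."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    words = " ".join(parser.parts).split()
    # Compare case-insensitively, ignoring surrounding punctuation.
    kept = [w for w in words
            if w.lower().strip(".,;:!?()\"'") not in stop_words]
    return " ".join(kept)


if __name__ == "__main__":
    sample = '<p><font color="red">Red</font> shoes, click here!</p>'
    print(clean_for_indexing(sample))
```

One side effect worth keeping in mind with this kind of approach: a word removed before indexing can never be matched by a query, so the extra stop word list should stay conservative.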
In these situations I would suggest using a dedicated search engine (not that PG's full text search isn't real...) such as mnoGoSearch, Lucene, or others.

Ries

On Aug 21, 2009, at 9:56 PM, xaviergxf wrote:
> My question: What i'm thinking to do, has any collateral effects? Any
> suggestions?
>
> Thanks!
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

regards, Ries van Twisk
web: http://www.rvantwisk.nl/
On Fri, 21 Aug 2009, xaviergxf wrote:
> Strip all html tags then use my php script to remove more stop words
> (because i can't edit stop words file on the server).
>
> My question: What i'm thinking to do, has any collateral effects? Any
> suggestions?

You shouldn't bother stripping all the HTML tags. Just create your own text search configuration that indexes only what you want. Read the documentation for details.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
If I strip all the HTML tags and filter more stop words, will the search be more accurate? Right now my full text statistics return words like "font" (from <font> tags, I guess) and other garbage. If I do that, will it improve the speed of my search?

Thanks!

PS: I cannot use other tools like mnoGoSearch, Lucene, etc., because I have no root password on my server.

On 22 Aug, 02:20, o...@sai.msu.su (Oleg Bartunov) wrote:
> You shouldn't bother to strip all html tags, just create your own text search
> configuration, which index only what do you want. Read documentation for
> details.
On Sat, 22 Aug 2009, xaviergxf wrote:
> If i strip all html tags and filter more stop words, will the search
> be more accurate? Actually my fulltext stats returns some like: font
> from <font> tags i guess, and other garbage.
> If i do that, will i improve the speed of my search?

What do you mean by 'accurate'? You need to be a bit more 'accurate' yourself when asking :) You need to provide more information about your problem: for example, the PostgreSQL version, the size of the collection you indexed, the EXPLAIN ANALYZE output for your query, the 'garbage' you got, etc. This is not difficult, just copy-and-paste work.

Regards,
Oleg