Thread: Improving Full text performance

Improving Full text performance

From
xaviergxf
Date:
Hi,


   I'm using PHP and full text search on PostgreSQL 8.3 for indexing HTML
descriptions. I have no access to the PostgreSQL server, since I use a
shared hosting service.
    To improve search and performance, I want to do the following:

Strip all HTML tags, then use my PHP script to remove more stop words
(because I can't edit the stop words file on the server).

My question: does what I'm planning to do have any side effects? Any
suggestions?

Thanks!

Re: Improving Full text performance

From
Ries van Twisk
Date:
In these situations I would suggest using a real (not that PG's FT is
not real...) search engine
like mnoGoSearch, Lucene or others...

Ries

On Aug 21, 2009, at 9:56 PM, xaviergxf wrote:

> Hi,
>
>
>   I'm using PHP and full text search on PostgreSQL 8.3 for indexing HTML
> descriptions. I have no access to the PostgreSQL server, since I use a
> shared hosting service.
>    To improve search and performance, I want to do the following:
>
> Strip all HTML tags, then use my PHP script to remove more stop words
> (because I can't edit the stop words file on the server).
>
> My question: does what I'm planning to do have any side effects? Any
> suggestions?
>
> Thanks!
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


            regards, Ries van Twisk

-------------------------------------------------------------------------------------------------
tags: Freelance TYPO3 Glassfish JasperReports JasperETL Flex Blaze-DS
WebORB PostgreSQL DB-Architect
email: ries@vantwisk.nl        web:   http://www.rvantwisk.nl/
skype: callto://r.vantwisk
Phone: +1-810-476-4196    Cell: +593 9901 7694    SIP: +1-747-690-5133

Re: Improving Full text performance

From
Oleg Bartunov
Date:
On Fri, 21 Aug 2009, xaviergxf wrote:

> Hi,
>
>
>   I'm using PHP and full text search on PostgreSQL 8.3 for indexing HTML
> descriptions. I have no access to the PostgreSQL server, since I use a
> shared hosting service.
>    To improve search and performance, I want to do the following:
>
> Strip all HTML tags, then use my PHP script to remove more stop words
> (because I can't edit the stop words file on the server).
>
> My question: does what I'm planning to do have any side effects? Any
> suggestions?

You shouldn't bother stripping all HTML tags; just create your own text search
configuration that indexes only what you want. Read the documentation for
details.


     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
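
A note on the suggestion above: creating a text search configuration normally
does not need superuser rights, only CREATE privilege on a schema, so it may be
possible even on shared hosting (an assumption; the host may restrict it). The
following is a minimal sketch, not a tested recipe. The configuration name
html_desc, the list of token types to drop, and the sample string are
illustrative and do not come from the thread.

```sql
-- Hypothetical name: html_desc. Start from the built-in English configuration.
CREATE TEXT SEARCH CONFIGURATION public.html_desc ( COPY = pg_catalog.english );

-- Stop indexing token types that are unlikely to be useful search terms.
-- IF EXISTS avoids an error for types that have no mapping in the copy.
ALTER TEXT SEARCH CONFIGURATION public.html_desc
    DROP MAPPING IF EXISTS FOR tag, entity, email, url, url_path, host, file,
                               sfloat, float;

-- Inspect how the default parser classifies each piece of a sample document
-- and which lexemes (if any) it produces under this configuration.
SELECT alias, token, lexemes
FROM ts_debug('public.html_desc',
              '<font color="red">Red bicycles</font> at http://example.com/bikes');

-- Build a tsvector with the custom configuration.
SELECT to_tsvector('public.html_desc',
                   '<font color="red">Red bicycles</font> at http://example.com/bikes');
```

Running ts_debug on a real description is a quick way to see where a term
such as "font" enters the index before deciding what to strip in PHP.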

Re: Improving Full text performance

From
xaviergxf
Date:
If I strip all HTML tags and filter more stop words, will the search
be more accurate? Currently my full text stats return things like "font"
(from <font> tags, I guess) and other garbage.
 If I do that, will I improve the speed of my search?

Thanks!

PS: I cannot use other tools like mnoGoSearch, Lucene, etc., because I
have no root password for my server.

On 22 Aug, 02:20, o...@sai.msu.su (Oleg Bartunov) wrote:
> You shouldn't bother stripping all HTML tags; just create your own text search
> configuration that indexes only what you want. Read the documentation for
> details.


Re: Improving Full text performance

From
Oleg Bartunov
Date:
On Sat, 22 Aug 2009, xaviergxf wrote:

> If I strip all HTML tags and filter more stop words, will the search
> be more accurate? Currently my full text stats return things like "font"
> (from <font> tags, I guess) and other garbage.
>  If I do that, will I improve the speed of my search?

What do you mean by 'accurate'? You need to be a bit more 'accurate' yourself
when asking :)  You need to provide more information about your problem:
for example, the version of PostgreSQL, the size of the collection you indexed,
the EXPLAIN ANALYZE output for your query, the 'garbage' you got, etc.
This is not difficult - just copy'n'paste work.


     Regards,
         Oleg
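
The information requested above could be gathered with queries along these
lines. This is only a sketch: the table items, its columns id and description,
the 'english' configuration, and the search term are assumptions made for
illustration, since the thread does not show the actual schema or query.

```sql
-- Server version.
SELECT version();

-- Size of the indexed collection: row count and on-disk size, indexes included.
SELECT count(*) AS documents,
       pg_size_pretty(pg_total_relation_size('items')) AS total_size
FROM items;

-- Plan and timing for a representative search.
EXPLAIN ANALYZE
SELECT id
FROM items
WHERE to_tsvector('english', description) @@ to_tsquery('english', 'bicycle');

-- Most frequent lexemes in the collection, to show the 'garbage' terms.
SELECT word, ndoc, nentry
FROM ts_stat('SELECT to_tsvector(''english'', description) FROM items')
ORDER BY nentry DESC, ndoc DESC, word
LIMIT 20;
```

Pasting the output of these four statements into a reply would cover the
version, collection size, query plan, and unwanted terms in one go.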