Thread: Search engine
In case people didn't notice from the website, or from the commits going in, I have now finally activated the new tsearch2 based search engine on search.postgresql.org. All code is in cvs, so if you want to improve on it, go right ahead :-) //Magnus
On Mon, 2006-12-18 at 21:19 +0100, Magnus Hagander wrote: > In case people didn't notice from the website, or from the commits going > in, I have now finally activated the new tsearch2 based search engine > on search.postgresql.org. > > All code is in cvs, so if you want to improve on it, go right ahead :-) I would like to take a moment and thank Magnus for his hard work on this. search.postgresql.org is now humming along quite nicely since we are using a PostgreSQL native backend. Joshua D. Drake, Command Prompt, Inc. (http://www.commandprompt.com/)
On Mon, 18 Dec 2006, Magnus Hagander wrote: > In case people didn't notice from the website, or from the commits going > in, I have now finally activated the new tsearch2 based search engine > on search.postgresql.org. > > All code is in cvs, so if you want to improve on it, go right ahead :-) Just a thought: I'd consider adding a portion of the title to the rank when searching the PostgreSQL documentation, so that 'create table' would return the latest version (8.2) first. Or just use the creation date. Regards, Oleg -- Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
Two things: Navigate to http://www.postgresql.org/docs/8.2/static/index.html and search for 'create table' Pages 1-20 of 247. Click 'next' Pages 21-40 of more than 1000. Obviously, the hyperlinks need to include the &u=/docs/8.2/static/ Archives search is slow. (5+ seconds to search all lists) Apart from that, bloody good work. Thumbs up. ... John
Magnus Hagander wrote: > In case people didn't notice from the website, or from the commits going > in, I have now finally activated the new tsearch2 based search engine > on search.postgresql.org. Nice work - glad to see it's up and running now :-) So, to keep you on your toes, here's a couple of bugs :-p - On the archives search, the pgAdmin lists are under Community lists, as are sfpug and sydpug. I would expect to see the latter two under Regional, and the pgAdmin ones under Project lists. The list should probably mirror the layout on the archives index page. - Site restrictions are lost in the page links. For example, search within a document set, and the first page returned is correctly limited to that docset. Click on a page link, and the restriction is lost. Regards, Dave.
On Tue, Dec 19, 2006 at 08:53:45AM +0300, Oleg Bartunov wrote: > On Mon, 18 Dec 2006, Magnus Hagander wrote: > > >In case people didn't notice from the website, or from the commits going > >in, I have now finally activated the new tsearch2 based search engine > >on search.postgresql.org. > > > >All code is in cvs, so if you want to improve on it, go right ahead :-) > > Just a thought. > I'd think about adding to rank a portion of title, when search PostgreSQL > documentation, so 'create table' would return latest version (8.2) first. > Or just use creation date. It's on my TODO to be able to do "suburl weighting", which would then mean that we could rank, say, /docs/current/ higher than anything else, etc. It's just not done yet; I wanted to get the basics out there first. //Magnus
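Editorial sketch of the "suburl" idea above: one way it could work is to multiply the text-search rank by a factor chosen from the longest URL prefix that matches the page. This is only an illustration, not the code in cvs; the function name `weighted_rank`, the prefix table, and the weight values are all assumptions made up for the example.

```python
# Hypothetical prefix weights (not from the thread); a longer prefix is more specific.
SUBURL_WEIGHTS = {
    "/docs/current/": 2.0,
    "/docs/8.2/": 1.5,
    "/docs/": 1.0,
}


def weighted_rank(base_rank, path, weights=None):
    """Scale a text-search rank by the weight of the longest matching URL prefix."""
    if weights is None:
        weights = SUBURL_WEIGHTS
    matches = [prefix for prefix in weights if path.startswith(prefix)]
    if not matches:
        # Pages outside every known prefix keep their original rank.
        return base_rank
    return base_rank * weights[max(matches, key=len)]


# Two hits as (path, base rank); the older page has the slightly better text rank.
hits = [
    ("/docs/7.4/static/sql-createtable.html", 0.9),
    ("/docs/current/static/sql-createtable.html", 0.8),
]
ordered = sorted(hits, key=lambda h: weighted_rank(h[1], h[0]), reverse=True)
```

With these made-up weights the /docs/current/ page sorts ahead of the 7.4 page even though its base rank is lower, which is the effect Oleg asked for.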
On Tue, Dec 19, 2006 at 07:01:34PM +1100, John Hansen wrote: > Two things: > > Navigate to http://www.postgresql.org/docs/8.2/static/index.html and search for 'create table' > > Pages 1-20 of 247. > > Click 'next' > > Pages 21-40 of more than 1000. > > Obviously, the hyperlinks need to include the &u=/docs/8.2/static/ Should be fixed now, thanks for pointing it out. > Archives search is slow. (5+ seconds to search all lists) Depends on what you search for ;-) What did you search for? The problem is that the search itself is very fast (single-digit milliseconds much of the time), but if you get several thousand hits it takes a while to sort them. It could also be that you searched for something that had been forced out of the cache, which happens for example when the backups run. Then the first search takes a while, but it gets faster with time. > Apart from that, bloody good work. Thumbs up. Thanks! //Magnus
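Editorial sketch of the paging bug discussed above: the "next" link has to carry the same restriction as the first request, otherwise page 2 searches the whole site. The `u` parameter comes from John's report; the `/search` path and the `q` and `p` parameter names are assumptions for illustration, not the site's actual URL scheme.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def page_link(base, query, page, suburl=None):
    """Build a result-page link that keeps the site restriction, if any."""
    params = {"q": query, "p": page}  # "q" and "p" are hypothetical names
    if suburl:
        # Drop this and every page after the first silently loses the filter.
        params["u"] = suburl
    return base + "?" + urlencode(params)


link = page_link("/search", "create table", 2, suburl="/docs/8.2/static/")
# Round-trip the query string to confirm the restriction survived.
restored = parse_qs(urlsplit(link).query)
```

Building every page link through one helper like this keeps the first page and the following pages from drifting apart.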
On Tue, Dec 19, 2006 at 08:22:13AM +0000, Dave Page wrote: > Magnus Hagander wrote: > >In case people didn't notice from the website, or from the commits going > > in, I have now finally activated the new tsearch2 based search engine > >on search.postgresql.org. > > Nice work - glad to see it's up and running now :-) > > So, to keep you on your toes, here's a couple of bugs :-p But of course ;-) > - On the archives search, the pgAdmin lists are under Community lists, > as are sfpug and sydpug. I would expect to see the latter two under > Regional, and the pgAdmin ones under Project lists. The list should > probably mirror the layout on the archves index page. It all lives in the database, so that's an easy fix. Are you saying that the "Community lists" header should go away completely? I have moved the regional ones already, will do pgadmin when you confirm that :-) > - Site restrictions are lost in the page links. For example, search > within a document set, and the first page returned is correctly limited > to that docset. Click on a page link, and the restriction is lost. Fixed now. //Magnus
Magnus Hagander wrote: > On Tue, Dec 19, 2006 at 08:22:13AM +0000, Dave Page wrote: >> Magnus Hagander wrote: >>> In case people didn't notice from the website, or from the commits going >>> in, I have now finally activated the new tsearch2 based search engine >>> on search.postgresql.org. >> Nice work - glad to see it's up and running now :-) >> >> So, to keep you on your toes, here's a couple of bugs :-p > > But of course ;-) > >> - On the archives search, the pgAdmin lists are under Community lists, >> as are sfpug and sydpug. I would expect to see the latter two under >> Regional, and the pgAdmin ones under Project lists. The list should >> probably mirror the layout on the archves index page. > > It all lives in the database, so that's an easy fix. Are you saying that > the "Community lists" header should go away completely? > > I have moved the regional ones already, will do pgadmin when you confirm > that :-) I'm saying the structure should follow that of the side menu on http://archives.postgresql.org/
# User lists
* pgsql-admin
* pgsql-advocacy
* pgsql-announce
* pgsql-bugs
* pgsql-docs
* pgsql-cygwin
* pgsql-general
* pgsql-interfaces
* pgsql-jdbc
* pgsql-jobs
* pgsql-novice
* pgsql-odbc
* pgsql-performance
* pgsql-php
* pgsql-ports
* pgsql-sql
# Developer lists
* pgsql-committers
* pgsql-hackers
* pgsql-patches
* pgsql-www
# Regional lists
* pgsql-de-allgemein
* pgsql-es-ayuda
* pgsql-fr-generale
* pgsql-ru-general
* pgsql-tr-genel
# Project lists
* pgadmin-hackers
* pgadmin-support
# User groups
* San Francisco
* Sydney
# Inactive lists
* pgsql-benchmarks
* pgsql-chat
* pgsql-hackers-win32
So I was wrong about the PUGs - they have their own section. Sorry :-) /D
Magnus Hagander Wrote: > > Archives search is slow. (5+ seconds to search all lists) > > Depends on what you search for ;-) What did you search for? 'create table'
John Hansen wrote: > Magnus Hagander Wrote: > >>> Archives search is slow. (5+ seconds to search all lists) >> Depends on what you search for ;-) What did you search for? > > 'create table' Yeah, that one is definitely the sort. Just the search over all lists for create table takes about 170ms and returns about 6000 hits. Calculating rank value and sorting it takes the rest :( //Magnus
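Editorial sketch of why the cost lands after the search: even if only the 20 best hits are shown, a rank has to be computed for every one of the roughly 6000 matches before the best ones can be picked. The toy below counts rank computations; `toy_rank` is a made-up stand-in for the real rank calculation, and using `heapq.nlargest` is just one way to select the top page, not what the site does.

```python
import heapq


def top_hits(hits, rank_fn, k=20):
    """Return the k best hits; rank_fn is still called once for every hit."""
    return heapq.nlargest(k, hits, key=rank_fn)


calls = {"n": 0}  # counts how many rank computations were needed


def toy_rank(doc_id):
    """Stand-in for an expensive rank computation (hypothetical formula)."""
    calls["n"] += 1
    return (doc_id * 37) % 101


# About 6000 hits, as in the 'create table' example, but only one page of output.
page = top_hits(range(6000), toy_rank, k=20)
```

Selecting the top 20 avoids a full sort, but `calls["n"]` still ends up equal to the number of hits, so the per-hit rank work remains.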
On Tuesday 19 December 2006 17:13, Magnus Hagander wrote: > John Hansen wrote: > > Magnus Hagander Wrote: > >>> Archives search is slow. (5+ seconds to search all lists) > >> > >> Depends on what you search for ;-) What did you search for? > > > > 'create table' > > Yeah, that one is definitely the sort. Just the search over all lists > for create table takes about 170ms and returns about 6000 hits. > Calculating rank value and sorting it takes the rest :( > Maybe we could add a "click here to see the explain analyze of your search query" button. :-) -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On Thu, 21 Dec 2006, Robert Treat wrote: > On Tuesday 19 December 2006 17:13, Magnus Hagander wrote: >> John Hansen wrote: >>> Magnus Hagander Wrote: >>>>> Archives search is slow. (5+ seconds to search all lists) >>>> >>>> Depends on what you search for ;-) What did you search for? >>> >>> 'create table' >> >> Yeah, that one is definitely the sort. Just the search over all lists >> for create table takes about 170ms and returns about 6000 hits. >> Calculating rank value and sorting it takes the rest :( >> > > Maybe we could add a "click here to see the explain analyze of your search > query" button. :-) Just to make the problem clear: an ordinary search engine doesn't have this problem because all the information needed for ranking is available from the index itself. A database-based search engine like tsearch2 should be able to work even without an index, so after the search is done, which is very fast, we need to consult the heap to get positional information, weights, etc. Also, since the GiST index is lossy, we must check the heap to exclude false hits. I'm wondering if we could store positional information in a GIN index, which is not lossy! Regards, Oleg
Dave Page wrote: >>> - On the archives search, the pgAdmin lists are under Community >>> lists, as are sfpug and sydpug. I would expect to see the latter two >>> under Regional, and the pgAdmin ones under Project lists. The list >>> should probably mirror the layout on the archves index page. >> >> It all lives in the database, so that's an easy fix. Are you saying that >> the "Community lists" header should go away completely? >> >> I have moved the regional ones already, will do pgadmin when you confirm >> that :-) > > I'm saying the structure should follow that of the side menu on > http://archives.postgresql.org/ Updated per this structure. //Magnus