Thread: search.postgresql.org
Hello,

I was looking at our search configuration and I noted that, out of a weight of 100, we are only weighting archives at 50 and varlena at 25. That seems a little low, as archives is the primary source of practical, real-world issues and varlena is a great resource.

Thoughts?

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

"Joshua D. Drake" <jd@commandprompt.com> writes:
> I was looking at our search configuration and I noted that, out of a
> weight of 100, we are only weighting archives at 50 and varlena at 25.
> That seems a little low, as archives is the primary source of practical,
> real-world issues and varlena is a great resource.

Um, what other resources are rated higher? If you'll pardon my ignorance, what other resources is the search engine considering at all?

regards, tom lane
On Sat, 25 Mar 2006, Tom Lane wrote:

> "Joshua D. Drake" <jd@commandprompt.com> writes:
>> I was looking at our search configuration and I noted that, out of a
>> weight of 100, we are only weighting archives at 50 and varlena at 25.
>> That seems a little low, as archives is the primary source of practical,
>> real-world issues and varlena is a great resource.
>
> Um, what other resources are rated higher? If you'll pardon my
> ignorance, what other resources is the search engine considering at all?

And maybe a short explanation of what these 'weights' are all about? Does that just mean that when I search for something, archives results will always precede varlena's? And, if so, and we are only doing archives vs. varlena, as long as archives is weighted higher than varlena, does it matter what the weights are? They could be 1 and 2, no?

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664
On Sat, 2006-03-25 at 02:24 -0400, Marc G. Fournier wrote:

> And maybe a short explanation of what these 'weights' are all about? Does
> that just mean that when I search for something, archives results will
> always precede varlena's? And, if so, and we are only doing archives vs.
> varlena, as long as archives is weighted higher than varlena, does it
> matter what the weights are? They could be 1 and 2, no?

That's right, the siteweights are just for making sure results from one site are displayed before those from another.

A search on search.postgresql.org/archives.search ONLY considers the archives anyway. search.postgresql.org/www.search, however, returns results from all the indexed sites, in order of relevance, with priority according to the siteweights.
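The ordering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it treats the site weight as the primary sort key and relevance as the secondary one, which is one reading of the explanation, not a confirmed description of how the search engine works internally. The function name `order_results`, the relevance scores and the URL paths are hypothetical; only the weights 50 and 25 come from the thread.

```python
from urllib.parse import urlsplit

# Weights for the two sites discussed so far (values quoted in the thread).
SITE_WEIGHTS = {
    "archives.postgresql.org": 50,
    "www.varlena.com": 25,
}


def order_results(results, weights=SITE_WEIGHTS):
    """Sort (url, relevance) pairs: higher site weight first, then higher relevance.

    Assumption: weight strictly dominates relevance. Unknown hosts get weight 0.
    """
    def sort_key(item):
        url, relevance = item
        host = urlsplit(url).hostname or ""
        return (-weights.get(host, 0), -relevance)

    return sorted(results, key=sort_key)


# Hypothetical result set: (url, relevance score).
results = [
    ("http://www.varlena.com/a", 0.9),
    ("http://archives.postgresql.org/b", 0.4),
    ("http://archives.postgresql.org/c", 0.7),
]

ordered = order_results(results)

# With weights of 2 and 1 instead of 50 and 25 the order is identical,
# because under this assumption only the relative order of weights matters.
same = order_results(results, {"archives.postgresql.org": 2, "www.varlena.com": 1})
```

Under this reading, Marc's point holds: any pair of weights that ranks archives above varlena gives the same result order.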
> Um, what other resources are rated higher? If you'll pardon my
> ignorance, what other resources is the search engine considering at all?

Below are the weights. As John said earlier, they just decide whose links show up first.

It seems to me that archives should be second only to www.postgresql.org. Once the new techdocs is live and we get some solid content on there, then we can push that up.

Thoughts?

SiteWeight http://www.postgresql.org/ 100
SiteWeight http://advocacy.postgresql.org/ 100
SiteWeight http://jdbc.postgresql.org/ 100
SiteWeight http://developer.postgresql.org/ 100

# Authoritative project site
SiteWeight http://gborg.postgresql.org/ 75
SiteWeight http://pgadmin.postgresql.org/ 75
SiteWeight http://phppgadmin.sourceforge.net/ 75
SiteWeight http://pgfoundry.org/ 75

# User contributed stuff
SiteWeight http://techdocs.postgresql.org/ 50
SiteWeight http://archives.postgresql.org/ 50

# Outside but reliable
SiteWeight http://www.varlena.com/ 25

# And the rest...
SiteWeight http://www.postgresql.cl/ 0
SiteWeight http://postgresql.ok.cz/ 0
SiteWeight http://www.postgresql.jp/ 0
SiteWeight http://www.postgresqlfr.org/ 0
SiteWeight http://www.linuxshare.ru/ 0
SiteWeight http://www.postgres.de/ 0
SiteWeight http://www.pgsqldb.org/ 0
SiteWeight http://www.postgresql.org.br/ 0

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
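For anyone who wants to inspect or compare weights like these programmatically, here is a minimal Python sketch that reads lines of the form `SiteWeight <url> <weight>` into a dictionary. It assumes `#` starts a comment and that each directive has exactly three whitespace-separated fields, which is inferred from the listing above rather than from the search engine's documentation. The function name `parse_site_weights` is hypothetical, and `SAMPLE` is a shortened excerpt of the posted config.

```python
def parse_site_weights(config_text):
    """Return {url: weight} for every 'SiteWeight <url> <weight>' line.

    Assumption: '#' begins a comment. URLs containing '#' would need
    different handling, but none appear in the posted config.
    """
    weights = {}
    for raw_line in config_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments, trim blanks
        if not line:
            continue
        parts = line.split()
        if len(parts) == 3 and parts[0] == "SiteWeight":
            weights[parts[1]] = int(parts[2])
    return weights


# Shortened excerpt of the config posted in the thread.
SAMPLE = """\
SiteWeight http://www.postgresql.org/ 100

# User contributed stuff
SiteWeight http://techdocs.postgresql.org/ 50
SiteWeight http://archives.postgresql.org/ 50

# Outside but reliable
SiteWeight http://www.varlena.com/ 25
"""

site_weights = parse_site_weights(SAMPLE)
```

A dictionary like this makes it easy to list which sites share a tier, or to test what a proposed change (such as raising archives above 50) would do to the ordering.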
-----Original Message-----
From: "Joshua D. Drake" <jd@commandprompt.com>
Sent: 25/03/06 16:53:27
To: "Tom Lane" <tgl@sss.pgh.pa.us>
Cc: "PostgreSQL WWW" <pgsql-www@postgresql.org>
Subject: Re: [pgsql-www] search.postgresql.org

> It seems to me that archives should be second only to
> www.postgresql.org.

I disagree - the web search is intended to favour factual sites over the archives, which may contain incorrect info. If you want to search the archives only, you still can.

> Once the new techdocs is live and we get some solid
> content on there, then we can push that up.

That is on www so will have the same rating.

/D
On Saturday 25 March 2006 11:58, Joshua D. Drake wrote:
> > Um, what other resources are rated higher? If you'll pardon my
> > ignorance, what other resources is the search engine considering at all?
>
> Below are the weights. As John said earlier, they just decide whose links
> show up first.
>
> It seems to me that archives should be second only to
> www.postgresql.org. Once the new techdocs is live and we get some solid
> content on there, then we can push that up.
>
> Thoughts?
>
> # User contributed stuff
> SiteWeight http://techdocs.postgresql.org/ 50
> SiteWeight http://archives.postgresql.org/ 50

Should planetpostgresql.org get added to this section?

Also, maybe Oleg/Teodor could weigh in with which sites they are currently indexing in pgsql.ru?

--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
I don't understand this whole thread. Manual intervention defeats the search engine's own ranking and usually produces very bad results. Is it possible to use citations? Also, is it possible to group results by site? That could soften the effect of incorrect ranking. At www.pgsql.ru/db/pgsearch we have no manual weight correction and trust the authors of web pages.

Oleg

On Sat, 25 Mar 2006, Dave Page wrote:

>> It seems to me that archives should be second only to
>> www.postgresql.org.
>
> I disagree - the web search is intended to favour factual sites over the
> archives, which may contain incorrect info. If you want to search the
> archives only, you still can.
>
>> Once the new techdocs is live and we get some solid
>> content on there, then we can push that up.
>
> That is on www so will have the same rating.
>
> /D

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
On Sat, 25 Mar 2006, Robert Treat wrote:

> Should planetpostgresql.org get added to this section?
>
> Also, maybe Oleg/Teodor could weigh in with which sites they are currently
> indexing in pgsql.ru?

We have 78 sites indexed (see http://pgsql.ru/db/pgsearch/stat.html) and do nothing with weights.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
-----Original Message-----
From: "Oleg Bartunov" <oleg@sai.msu.su>
Sent: 25/03/06 20:49:25
To: "Dave Page" <dpage@vale-housing.co.uk>
Cc: "jd@commandprompt.com" <jd@commandprompt.com>, "tgl@sss.pgh.pa.us" <tgl@sss.pgh.pa.us>, "pgsql-www@postgresql.org" <pgsql-www@postgresql.org>
Subject: Re: [pgsql-www] search.postgresql.org

> Also, is it possible to group results by site?

It does, when appropriate.

> That could soften the effect of incorrect
> ranking. At www.pgsql.ru/db/pgsearch we
> have no manual weight correction and
> trust the authors of web pages.

The config being shown is the one used for the main www.postgresql.org site search, thus results from that site are favoured over external sites. The archives are ranked even lower because of the potential for additional noise.

www.pgsql.ru is a generic search where lack of weightings and trusting authors makes sense. search.postgresql.org provides site searches aimed at users coming from different perspectives of our site network, and we complement the site results with the external results for added value.

Regards, Dave
Dave Page wrote:
> The config being shown is the one used for the main www.postgresql.org
> site search, thus results from that site are favoured over external
> sites. The archives are ranked even lower because of the potential for
> additional noise.

Speaking of that, I noticed a problem a few days ago related to the scope of the search. You can reproduce it this way. Start from:
http://www.postgresql.org/docs/8.1/interactive/index.html

Search for "alter table", for example:
http://search.postgresql.org/www.search?ul=http%3A%2F%2Fwww.postgresql.org%2Fdocs%2F8.1%2Finteractive%2F%25&fm=on&cs=utf-8&q=ALTER+TABLE

You get 91 results, notably the results from the documentation.

But when you click the search button again on the results page, the ul parameter is no longer there and you get pretty bad results:
http://search.postgresql.org/www.search?cs=utf-8&fm=on&gr=on&o=0&ps=20&s=rate&q=ALTER+TABLE

We have only 15 results and the documentation is not there.

So we should probably keep the ul parameter in the new form when it is set in the query string. But I don't understand why we don't get the doc results when ul is not there; I suspect the search should be global in this case.

Regards,

--
Guillaume
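The fix proposed above (carrying `ul` over into the form on the results page) could look roughly like the following Python sketch. It is not the site's actual template code, which is not shown in this thread; the helper names `fields_to_keep` and `next_search_url` are hypothetical, and the only assumption taken from the message is that `ul` arrives in the query string and should be resubmitted with the next search.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# URL of the first search, copied from the message above.
FIRST_SEARCH = (
    "http://search.postgresql.org/www.search"
    "?ul=http%3A%2F%2Fwww.postgresql.org%2Fdocs%2F8.1%2Finteractive%2F%25"
    "&fm=on&cs=utf-8&q=ALTER+TABLE"
)


def fields_to_keep(current_url, keep=("ul",)):
    """Return (name, value) pairs from the current query string that the
    results-page form should resubmit, e.g. as hidden inputs."""
    query = parse_qs(urlsplit(current_url).query)
    return [(name, value) for name in keep for value in query.get(name, [])]


def next_search_url(current_url, new_query,
                    base="http://search.postgresql.org/www.search"):
    """Build the URL for a follow-up search that keeps the scope parameter."""
    fields = fields_to_keep(current_url) + [("q", new_query)]
    return base + "?" + urlencode(fields)


follow_up = next_search_url(FIRST_SEARCH, "ALTER TABLE")
```

In a real template the pairs returned by `fields_to_keep` would more likely be emitted as hidden form inputs than assembled into a URL, but the effect on the resulting query string is the same.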
-----Original Message-----
From: Guillaume Smet [mailto:guillaume-pg@smet.org]
Sent: Sun 3/26/2006 10:17 AM
To: Dave Page
Cc: oleg@sai.msu.su; jd@commandprompt.com; tgl@sss.pgh.pa.us; pgsql-www@postgresql.org
Subject: Re: [pgsql-www] search.postgresql.org

> But when you click on the search button again on the results page, the
> ul parameter is not here anymore and you have pretty bad results:

Hm, no, it does seem to vanish. John, is that one of those template formatting buglets?

> http://search.postgresql.org/www.search?cs=utf-8&fm=on&gr=on&o=0&ps=20&s=rate&q=ALTER+TABLE
> We have only 15 results and the documentation is not there.

What you see there is the site grouping coming into effect. The docs are all on www.postgresql.org, which due to the weightings does get listed first; however, it seems that the 2 results it shows from there are from Bruce's book (probably they mention the complete phrase most often, or similar). You then see a couple of results from each of the other sites with hits. Under each should be a 'Show more from this site' link. Click that on the www results, and you will see the documentation hits from other parts of that site.

Regards, Dave.
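The site grouping described above can be pictured with a small Python sketch. It assumes the engine takes an already-ranked list, shows at most two hits per site, and offers the rest behind a 'Show more from this site' link; the limit of two is read off this message, and the function name `group_by_site` and the URL paths are hypothetical.

```python
from urllib.parse import urlsplit


def group_by_site(ranked_urls, per_site=2):
    """Group an already-ranked URL list by host.

    Sites appear in the order of their best-ranked hit. Each group lists the
    first `per_site` hits and counts how many more are hidden behind a
    'Show more from this site' link.
    """
    groups = {}  # dicts keep insertion order in Python 3.7+
    for url in ranked_urls:
        host = urlsplit(url).hostname or ""
        groups.setdefault(host, []).append(url)
    return [
        {"site": host, "shown": urls[:per_site], "more": max(len(urls) - per_site, 0)}
        for host, urls in groups.items()
    ]


# Hypothetical ranked hits: three from www, one from the archives.
ranked = [
    "http://www.postgresql.org/a",
    "http://www.postgresql.org/b",
    "http://www.postgresql.org/c",
    "http://archives.postgresql.org/x",
]

grouped = group_by_site(ranked)
```

This would explain the counts Guillaume saw: the grouped page lists far fewer entries than the ungrouped one even though the same hits are still reachable.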
On Sun, Mar 26, 2006 at 12:00:34AM +0300, Oleg Bartunov wrote:
> we have 78 sites indexed, see http://pgsql.ru/db/pgsearch/stat.html
> and do nothing with weights.

Man, that's damn handy! We should really have something like that on the main site, perhaps a "search the official sites" and "search everything".

--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
> we have 78 sites indexed, see http://pgsql.ru/db/pgsearch/stat.html
> and do nothing with weights.

Wow! That is really, really fast and good. I would love to see this on the PostgreSQL main page (maybe linked), and ... do you do that with tsearch2?
Harald
--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
-
PostgreSQL - supported by a community that does not put you on hold