Thread: Let's play bash the search engine

Let's play bash the search engine

From
"Joshua D. Drake"
Date:
Hello,

search.postgresql.org is now served directly from PostgreSQL 8.2,
Tsearch2 and GIN. We have been testing thoroughly for the last couple of
weeks but of course... it is now open to the general public.

Take a look and let us know what you think and how it performs for you.

Sincerely,

Joshua D. Drake

--

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
             http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




Re: Let's play bash the search engine

From
"Thomas H."
Date:
> Take a look at let us know what you think and how it performs for you.

i would love an advanced search where you can limit the results to a
particular version of the documentation. the query for "SELECT" returns too
many results from too many versions, obviously.

its fast & quick tho :-)

regards,
thomas



Re: Let's play bash the search engine

From
Jorge Godoy
Date:
"Thomas H." <me@alternize.com> writes:

> i would love an advanced search where you can limit the results to a
> particular version of the documentation. the query for "SELECT" returns too
> many results from too many versions, obviously.

+1 on that.

> its fast & quick tho :-)

Indeed.


Be seeing you,
--
Jorge Godoy      <jgodoy@gmail.com>

Re: Let's play bash the search engine

From
Reece Hart
Date:
On Mon, 2006-12-18 at 15:47 -0800, Joshua D. Drake wrote:
> Take a look at let us know what you think and how it performs for you.

Terrific. Fast and meaningful.

I echo Thomas' request to have docs limited to a version (or, better, most recent). Perhaps archived docs should be searched via a separate page entirely.

Most of the queries I ran hit what I expected, except that the docs were for old versions. (In fact, I don't think the 8.2 docs ever showed up first.)

I tried "defer constraints" and got a few not-too-useful hits. However, "deferred constraints" returned meaningful links. Is that a stemmer problem?

-Reece

-- 
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
./universe -G 6.672e-11 -e 1.602e-19 -protonmass 1.673e-27 -uspres bush
kernel warning: universe consuming too many resources. Killing.
universe killed due to catastrophic leadership. Try -uspres carter.

Re: Let's play bash the search engine

From
Henrik Zagerholm
Date:
Hello,

Searching for "tsearch"
5. PostgreSQL: Documentation: Manuals: PostgreSQL 7.4: Examples [0.1]
...tsearch and tsearch2Full text
indexingPrevHomeNextLimitationsUpPage Files User Comments No comments
could be found for this...
http://www.postgresql.org/docs/7.4/interactive/examples.html

Searching for "tsearch2"
An error occured while searching.

Searching for "tsearch2full"
An error occured while searching.

Why is it so? =)

Cheers,
Henrik

On 19 Dec 2006, at 00:47, Joshua D. Drake wrote:

> Hello,
>
> search.postgresql.org is now served directly from PostgreSQL 8.2 ,
> Tsearch2 and GIN. We have been testing thoroughly for the last
> couple of
> weeks but of course... it is now open to the general public.
>
> Take a look at let us know what you think and how it performs for you.
>
> Sincerely,
>
> Joshua D. Drake
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq


Re: Let's play bash the search engine

From
Shane Ambler
Date:
Reece Hart wrote:
> On Mon, 2006-12-18 at 15:47 -0800, Joshua D. Drake wrote:
>
>> Take a look at let us know what you think and how it performs for you.
>
>
> Terrific. Fast and meaningful.
>
> I echo Thomas' request to have docs limited to a version (or, better,
> most recent). Perhaps archived docs should be searched via a separate
> page entirely.

+1 there - current docs should be searched unless you specify all/older
docs. Maybe old docs could be included in the archives section, although
the link doesn't make it clear that it leads to the mail archives.
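
A minimal sketch of the "current docs unless you ask for more" behaviour requested here. It assumes hits can be told apart by the version segment in their URL, as in the /docs/7.4/ link quoted elsewhere in this thread; the names CURRENT_VERSION, doc_version and filter_hits are hypothetical and not part of the actual search code.

```python
import re

# Hypothetical default; the thread only says 8.2 is the current release.
CURRENT_VERSION = "8.2"

# Matches the version segment in URLs shaped like
# http://www.postgresql.org/docs/7.4/interactive/examples.html
DOCS_VERSION = re.compile(r"/docs/(\d+\.\d+)/")


def doc_version(url):
    """Return the docs version embedded in a URL, or None if there is none."""
    match = DOCS_VERSION.search(url)
    return match.group(1) if match else None


def filter_hits(urls, version=CURRENT_VERSION):
    """Keep hits for one docs version; version=None keeps everything.

    Pages without a version in their URL (non-docs pages) are always kept.
    """
    if version is None:
        return list(urls)
    return [u for u in urls if doc_version(u) in (None, version)]
```

Called with no second argument, filter_hits drops hits from older manuals; passing version=None gives the "all/older docs" search.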


> Most the queries I did hit what I expected, except that the docs were
> for old versions. (In fact, I don't think 8.2 docs ever showed up
> first.)
>
> I tried "defer constraints" and got a few not-too-useful hits. However,
> "deferred constraints" returned meaningful links. Is that a stemmer
> problem?
>
> -Reece
>
I reckon the old search showed 3-4 lines in the 'preview' of the
listing, now we only get 1 which sometimes wraps partially onto 2.
Personally I preferred having 3-4 lines in the results - it can be
easier to pick out the page that you are searching for without going there.

One thing that I have seen on a few searches (eg. yahoo cached pages) is
when you follow a link it then highlights the search criteria on the
page. Would be a nice feature to quickly find the search result on the
destination page.



I found a little error in the page coding calculations -

Search for create
it states "Pages 1-20 of more than 1000." - that's ok
if you go to page 50 you get "Pages 981-1000 of more than 1000." - fine
then on page 51 you get "Your search for create returned no hits."

search for 'select' or 'update' gets the same thing. It would seem that
you have a 'limit 1000' which gives the 'more than 1000' in the hits
description but it generates an extra page (51) that tries to fetch
1001-1020
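
A minimal sketch of a page calculation that would avoid the empty page 51. The page size of 20 and the cap of 1000 are read off the messages quoted above; how the site actually computes its pages is not known, so this is only one guess at the cause: if 1001 rows are fetched to detect "more than 1000" and the page count is derived from that number, ceil(1001 / 20) gives 51 pages instead of 50.

```python
import math

PAGE_SIZE = 20    # matches "Pages 1-20" in the report above
MAX_HITS = 1000   # assumed cap, inferred from "more than 1000"


def page_count(total_hits, page_size=PAGE_SIZE, max_hits=MAX_HITS):
    """Number of pages that can actually be displayed."""
    shown = min(total_hits, max_hits)
    return math.ceil(shown / page_size)


def page_range(page, total_hits, page_size=PAGE_SIZE, max_hits=MAX_HITS):
    """Return (first, last) hit numbers for a 1-based page.

    Returns None when the page lies beyond the displayable hits.
    """
    if page < 1 or page > page_count(total_hits, page_size, max_hits):
        return None
    first = (page - 1) * page_size + 1
    last = min(page * page_size, total_hits, max_hits)
    return first, last
```

Clamping the hit count to the cap before dividing keeps the last page link at 50 no matter how many rows the query returned.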


--

Shane Ambler
pgSQL@007Marketing.com

Get Sheeky @ http://Sheeky.Biz

Re: Let's play bash the search engine

From
"Gurjeet Singh"
Date:
On 12/19/06, Henrik Zagerholm <henke@mac.se> wrote:
> Hello,
>
> Searching after "tsearch"
> 5. PostgreSQL: Documentation: Manuals: PostgreSQL 7.4: Examples [0.1]
> ...tsearch and tsearch2Full text
> indexingPrevHomeNextLimitationsUpPage Files User Comments No comments
> could be found for this...
> http://www.postgresql.org/docs/7.4/interactive/examples.html
>
> Searching after "tsearch2"
> An error occured while searching.
>
> Searching after "tsearch2full"
> An error occured while searching.

This error can be generalized to the reg-ex [[:alpha:]]+[[:digit:]]+
Examples:
A1
A2 etc...
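
A small sketch for checking which search terms fit this pattern. Python's re module has no POSIX classes such as [[:alpha:]], so explicit ASCII ranges stand in for them; the function names are hypothetical. Note that "tsearch2full" from the original report does not fit letters-then-digits when the whole term must match, so the second helper tests the looser condition "contains both letters and digits", which covers all three failing examples.

```python
import re

# ASCII stand-in for the POSIX expression [[:alpha:]]+[[:digit:]]+
LETTERS_THEN_DIGITS = re.compile(r"[A-Za-z]+[0-9]+")


def is_letters_then_digits(term):
    """True if the whole term is letters followed by digits (A1, tsearch2)."""
    return LETTERS_THEN_DIGITS.fullmatch(term) is not None


def mixes_letters_and_digits(term):
    """True if the term contains at least one letter and one digit."""
    has_letter = any(c.isalpha() for c in term)
    has_digit = any(c.isdigit() for c in term)
    return has_letter and has_digit
```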

> Why is it so? =)
>
> Cheers,
> Henrik


---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org/



--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | yahoo }.com

Re: Let's play bash the search engine

From
"Gurjeet Singh"
Date:
On 12/19/06, Shane Ambler <pgsql@007marketing.com> wrote:
> > I echo Thomas' request to have docs limited to a version (or, better,
> > most recent). Perhaps archived docs should be searched via a separate
> > page entirely.
>
> +1 there - current docs should be searched unless you specify all/older

count me in too...

> I reckon the old search showed 3-4 lines in the 'preview' of the
> listing, now we only get 1 which sometimes wraps partially onto 2.
> Personally I preferred having 3-4 lines in the results - it can be
> easier to pick out the page that you are searching for without going there.

same sentiments

> One thing that I have seen on a few searches (eg. yahoo cached pages) is
> when you follow a link it then highlights the search criteria on the
> page. Would be a nice feature to quickly find the search result on the
> destination page.

+1

> I found a little error in the page coding calculations -
>
> Search for create
> it states "Pages 1-20 of more than 1000." - that's ok
> if you go to page 50 you get "Pages 981-1000 of more than 1000." - fine
> then on page 51 you get "Your search for create returned no hits."
>
> search for 'select' or 'update' gets the same thing. It would seem that
> you have a 'limit 1000' which gives the 'more than 1000' in the hits
> description but it generates an extra page (51) that tries to fetch
> 1001-1020

Or is it possible that the LIMIT ... OFFSET combination is erroneous!

my '2 cents (or die tryin)',

--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | yahoo }.com

Re: Let's play bash the search engine

From
Magnus Hagander
Date:
On Tue, Dec 19, 2006 at 12:56:25AM +0100, Thomas H. wrote:
> >Take a look at let us know what you think and how it performs for you.
>
> i would love an advanced search where you can limit the results to a
> particular version of the documentation. the query for "SELECT" returns too
> many results from too many versions, obviously.

You get this if you go into say the 8.2 docs, and use the search form
there - same as before.

That said, it's not a bad idea to add it anyway, but so far the main
concern has been feature-identical to what we had before.

//Magnus

Re: Let's play bash the search engine

From
Magnus Hagander
Date:
On Tue, Dec 19, 2006 at 01:48:22PM +0530, Gurjeet Singh wrote:
> On 12/19/06, Henrik Zagerholm <henke@mac.se> wrote:
> >
> >Hello,
> >
> >Searching after "tsearch"
> >5. PostgreSQL: Documentation: Manuals: PostgreSQL 7.4: Examples [0.1]
> >...tsearch and tsearch2Full text
> >indexingPrevHomeNextLimitationsUpPage Files User Comments No comments
> >could be found for this...
> >http://www.postgresql.org/docs/7.4/interactive/examples.html
> >
> >Searching after "tsearch2"
> >An error occured while searching.
> >
> >Searching after "tsearch2full"
> >An error occured while searching.
>
>
> This error can be generalized to the reg-ex [[:alpha:]]+[[:digit:]]+
> Examples:
> A1
> A2 etc...
>
> Why is it so? =)

Seems to_tsvector() returns NULL for tsearch2 or for, as you say,
anything that ends in a digit.

Oleg, can you comment on why this is happening? What can we do to fix
that?

//Magnus

Re: Let's play bash the search engine

From
Oleg Bartunov
Date:
On Tue, 19 Dec 2006, Magnus Hagander wrote:

> On Tue, Dec 19, 2006 at 01:48:22PM +0530, Gurjeet Singh wrote:
>> On 12/19/06, Henrik Zagerholm <henke@mac.se> wrote:
>>>
>>> Hello,
>>>
>>> Searching after "tsearch"
>>> 5. PostgreSQL: Documentation: Manuals: PostgreSQL 7.4: Examples [0.1]
>>> ...tsearch and tsearch2Full text
>>> indexingPrevHomeNextLimitationsUpPage Files User Comments No comments
>>> could be found for this...
>>> http://www.postgresql.org/docs/7.4/interactive/examples.html
>>>
>>> Searching after "tsearch2"
>>> An error occured while searching.
>>>
>>> Searching after "tsearch2full"
>>> An error occured while searching.
>>
>>
>> This error can be generalized to the reg-ex [[:alpha:]]+[[:digit:]]+
>> Examples:
>> A1
>> A2 etc...
>>
>> Why is it so? =)
>
> Seems to_tsvector() returns NULL for tsearch2 or for, as you say,
> anything that ends in a digit.
>
> Oleg, can you comment on why this is happening? What can we do to fix
> that?

Most probably, token type 'word' just isn't indexed, if you
didn't correct this in the pgweb configuration:

-- we won't index/search some tokens
update pg_ts_cfgmap set dict_name = NULL
where tok_alias in ('email', 'url', 'sfloat', 'uri', 'float','word')
and ts_name = 'pg';
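
A toy model of what the UPDATE above does, assuming the parser files a term like "tsearch2" under the 'word' token type, as this diagnosis implies. The map below only imitates the idea of pg_ts_cfgmap; the entries, the 'simple' choice, and the index_tokens helper are illustrative assumptions, not the real pgweb settings.

```python
# Hypothetical token-type -> dictionary map; None means "do not index".
CFGMAP = {
    "lword": "en_stem",
    "word": None,
    "email": None,
    "url": None,
}


def index_tokens(tokens, cfgmap):
    """Keep only tokens whose type is mapped to a dictionary.

    `tokens` is a list of (token_type, text) pairs, as a parser might emit.
    """
    return [
        text.lower()
        for tok_type, text in tokens
        if cfgmap.get(tok_type) is not None
    ]


# With 'word' unmapped, a query consisting only of such a token is empty.
broken = index_tokens([("word", "tsearch2")], CFGMAP)

# Mapping 'word' to some dictionary makes the token searchable again.
fixed = index_tokens([("word", "tsearch2")], dict(CFGMAP, word="simple"))
```

Because the same mapping is applied when documents are indexed, existing vectors would also lack these tokens until they are rebuilt.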






     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

Don't split on underscore

From
Hannes Dorbath
Date:
I think it would be useful to adjust the parser to not split on underscores:

If I want to look up PG's to_number() function, I won't get anything
useful: "number" appears nearly everywhere and "to" is configured as a
stop word. The same goes for most other functions.
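
A rough sketch of the effect described here, using a plain regular-expression tokenizer rather than the real tsearch2 parser. The stop-word set and both function names are assumptions made for illustration.

```python
import re

STOP_WORDS = {"to"}   # the message above says "to" is a stop word


def tokenize(text, keep_underscore):
    """Split text into lowercase tokens, optionally treating '_' as a letter."""
    pattern = r"[A-Za-z0-9_]+" if keep_underscore else r"[A-Za-z0-9]+"
    return [t.lower() for t in re.findall(pattern, text)]


def search_terms(text, keep_underscore):
    """Tokenize and drop stop words, as a search front end might."""
    tokens = tokenize(text, keep_underscore)
    return [t for t in tokens if t not in STOP_WORDS]
```

Splitting on the underscore reduces to_number() to the single, very common term "number"; keeping the underscore preserves the function name as one searchable term.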


On 19.12.2006 00:47, Joshua D. Drake wrote:
> search.postgresql.org is now served directly from PostgreSQL 8.2 ,
> Tsearch2 and GIN. We have been testing thoroughly for the last couple of
> weeks but of course... it is now open to the general public.
>
> Take a look at let us know what you think and how it performs for you.


--
Regards,
Hannes Dorbath

Re: Let's play bash the search engine

From
Magnus Hagander
Date:
On Tue, Dec 19, 2006 at 01:13:16PM +0300, Oleg Bartunov wrote:
> >Seems to_tsvector() returns NULL for tsearch2 or for, as you say,
> >anything that ends in a digit.
> >
> >Oleg, can you comment on why this is happening? What can we do to fix
> >that?
>
> Most probably, token type 'word' just doesn't indexed. If you
> didnt' correct this from pgweb configuration:
>
> -- we won't index/search some tokens
> update pg_ts_cfgmap set dict_name = NULL
> where tok_alias in ('email', 'url', 'sfloat', 'uri', 'float','word')
> and ts_name = 'pg';

That sounds like it's the problem. I'll update the configuration, and I
assume I have to regenerate all the tsvectors as well, right?

Should I set it to 'simple' or one of the others?

//Magnus

Re: Let's play bash the search engine

From
Magnus Hagander
Date:
On Tue, Dec 19, 2006 at 01:24:01PM +0100, Magnus Hagander wrote:
> On Tue, Dec 19, 2006 at 01:13:16PM +0300, Oleg Bartunov wrote:
> > >Seems to_tsvector() returns NULL for tsearch2 or for, as you say,
> > >anything that ends in a digit.
> > >
> > >Oleg, can you comment on why this is happening? What can we do to fix
> > >that?
> >
> > Most probably, token type 'word' just doesn't indexed. If you
> > didnt' correct this from pgweb configuration:
> >
> > -- we won't index/search some tokens
> > update pg_ts_cfgmap set dict_name = NULL
> > where tok_alias in ('email', 'url', 'sfloat', 'uri', 'float','word')
> > and ts_name = 'pg';
>
> That sounds like it's the problem. I'll update the configuration, and I
> assume I have to regenerate all the tsvectors as well, right?
>
> Should I set it to 'simple' or one of the others?

This has now been fixed for both website and archive search. So now you
can search for the technology that made the search possible in the first
place again :-)

//Magnus

Re: Let's play bash the search engine

From
Lincoln Yeoh
Date:
Hi,

Seems ok. Works better than most corporate search engines - some tend
to show pages and pages of useless press releases when you are
searching for drivers, specifications etc.

But as long as the sites remain indexable to outside search engines,
people get to use whichever search engine they prefer.

For example: Google works fine with: site:postgresql.org

It also does phrase searches, pdfs (you can even do filetype: inurl:
and other stuff[1]).

Have fun!

Link.

[1] For example: filetype:pdf "company confidential"
or filetype:xls confidential price

At 07:47 AM 12/19/2006, Joshua D. Drake wrote:
>Hello,
>
>search.postgresql.org is now served directly from PostgreSQL 8.2 ,
>Tsearch2 and GIN. We have been testing thoroughly for the last couple of
>weeks but of course... it is now open to the general public.
>
>Take a look at let us know what you think and how it performs for you.
>
>Sincerely,
>
>Joshua D. Drake



Re: Let's play bash the search engine

From
Shane Ambler
Date:
Magnus Hagander wrote:
> On Tue, Dec 19, 2006 at 12:56:25AM +0100, Thomas H. wrote:
>>> Take a look at let us know what you think and how it performs for you.
>> i would love an advanced search where you can limit the results to a
>> particular version of the documentation. the query for "SELECT" returns too
>> many results from too many versions, obviously.
>
> You get this if you go into say the 8.2 docs, and use the search form
> there - same as before.

I would search from the home page rather than navigate to docs first.
(open browser - type postgres.org (enter) - tab to search field - type
whattofind (enter))

> That said, it's not a bad idea to add it anyway, but so far the main
> concern has been feature-identical to what we had before.

It would seem to be one of those 'accept it as it is' when you get here,
and now that you ask us to look we say 'but why?' ;-)


--

Shane Ambler
pgSQL@007Marketing.com

Get Sheeky @ http://Sheeky.Biz

Re: Let's play bash the search engine

From
Magnus Hagander
Date:
On Wed, Dec 20, 2006 at 01:35:57AM +1030, Shane Ambler wrote:
> Magnus Hagander wrote:
> >You get this if you go into say the 8.2 docs, and use the search form
> >there - same as before.
>
> I would search from the home page rather than navigate to docs first.
> (open browser - type postgres.org (enter) - tab to search field - type
> whattofind (enter)

Yeah, I can see how that's a common usage pattern.

> >That said, it's not a bad idea to add it anyway, but so far the main
> >concern has been feature-identical to what we had before.
>
> It would seem to be one of those 'accept it as it is' when you get here,
> and now that you ask us to look we say 'but why?' ;-)

Heh, it was actually Josh that asked you to look ;)

But seriously, I'm definitely interested in ways it can be improved - and
that's true of the whole web team, I'm sure. It was just my way of
saying "it will take a while", but I'll file it away as a good thing to
do when there is a moment of spare time.

//Magnus

Re: Let's play bash the search engine

From
Matthew O'Connor
Date:
Magnus Hagander wrote:
> But seriously, I'm definitly interested in ways it can be improved - and
> that's true of the whole web team, I'm sure. It was just my way of
> saying "it will take a while", but I'll file it away as a good thing to
> do when there is a moment of spare time.

I like the way the php.net homepage has a search box on the homepage
with a dropdown next to it to specify what to search.


Re: Let's play bash the search engine

From
Alvaro Herrera
Date:
Matthew O'Connor wrote:
> Magnus Hagander wrote:
> >But seriously, I'm definitly interested in ways it can be improved - and
> >that's true of the whole web team, I'm sure. It was just my way of
> >saying "it will take a while", but I'll file it away as a good thing to
> >do when there is a moment of spare time.
>
> I like the way the php.net homepage has a search box on the homepage
> with a dropdown next to it to specify what to search.

Yeah, that would be very appropriate, allowing you to search a specific
version of the docs.  Heck, if it allowed searching of specific mailing
lists, that would rock.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Let's play bash the search engine

From
"Gurjeet Singh"
Date:
On 12/19/06, Matthew O'Connor <matthew@zeut.net> wrote:
> Magnus Hagander wrote:
> > But seriously, I'm definitly interested in ways it can be improved - and
> > that's true of the whole web team, I'm sure. It was just my way of
> > saying "it will take a while", but I'll file it away as a good thing to
> > do when there is a moment of spare time.
>
> I like the way the php.net homepage has a search box on the homepage
> with a dropdown next to it to specify what to search.

I would recommend a set of check-boxes, so that the user can select multiple places to search, e.g. the 8.2 release, the 8.0 release and, as just suggested by Alvaro, the pgsql-hackers mailing list too.




--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | yahoo }.com

Re: Let's play bash the search engine

From
Magnus Hagander
Date:
On Tue, Dec 19, 2006 at 09:41:38PM +0530, Gurjeet Singh wrote:
> On 12/19/06, Matthew O'Connor <matthew@zeut.net> wrote:
> >
> >Magnus Hagander wrote:
> >> But seriously, I'm definitly interested in ways it can be improved - and
> >> that's true of the whole web team, I'm sure. It was just my way of
> >> saying "it will take a while", but I'll file it away as a good thing to
> >> do when there is a moment of spare time.
> >
> >I like the way the php.net homepage has a search box on the homepage
> >with a dropdown next to it to specify what to search.
>
>
> I would recommend a set of check-boxes, so that user can select multiple
> places to search. Eg. search in 8.2 release, 8.0 release, ans as just
> suggested by Alvaro, pgsql-hackers mailing list also.

You seriously want this in the header of every webpage?!

We still have a search form at search.postgresql.org. To search the
mailing lists, click Archives search. You can select a list, or a group
of lists, there. Or just search them all.

Now, the generic website search could be expanded to let you search
different parts such as the docs, but we cannot possibly do that on the
searchbox in the header.

//Magnus

Re: Let's play bash the search engine

From
Oleg Bartunov
Date:
On Tue, 19 Dec 2006, Alvaro Herrera wrote:

> Matthew O'Connor wrote:
>> Magnus Hagander wrote:
>>> But seriously, I'm definitly interested in ways it can be improved - and
>>> that's true of the whole web team, I'm sure. It was just my way of
>>> saying "it will take a while", but I'll file it away as a good thing to
>>> do when there is a moment of spare time.
>>
>> I like the way the php.net homepage has a search box on the homepage
>> with a dropdown next to it to specify what to search.
>
> Yeah, that would be very appropriate, allowing you to search specific
> version of the docs.  Heck, if it allowed searching of specific mail
> lists, that would rock.

It should be pretty easy once the documents have appropriate metadata.
Also, displaying the current section in the search box would be informative,
so when you're in the current documentation, the pull-down menu should display
<8.2 Documentation>.

     Regards,
         Oleg

Re: Let's play bash the search engine

From
Magnus Hagander
Date:
On Tue, Dec 19, 2006 at 01:02:09PM -0300, Alvaro Herrera wrote:
> Matthew O'Connor wrote:
> > Magnus Hagander wrote:
> > >But seriously, I'm definitly interested in ways it can be improved - and
> > >that's true of the whole web team, I'm sure. It was just my way of
> > >saying "it will take a while", but I'll file it away as a good thing to
> > >do when there is a moment of spare time.
> >
> > I like the way the php.net homepage has a search box on the homepage
> > with a dropdown next to it to specify what to search.
>
> Yeah, that would be very appropriate, allowing you to search specific
> version of the docs.  Heck, if it allowed searching of specific mail
> lists, that would rock.

Definitely worth looking at what we can do without breaking the layout
there.

As for searching specific mailing lists, you can do that from the
archives search page already. Perhaps it needs to be made more
accessible or something?

//Magnus

Re: Let's play bash the search engine

From
Oleg Bartunov
Date:
On Tue, 19 Dec 2006, Gurjeet Singh wrote:

> On 12/19/06, Matthew O'Connor <matthew@zeut.net> wrote:
>>
>> Magnus Hagander wrote:
>> > But seriously, I'm definitly interested in ways it can be improved - and
>> > that's true of the whole web team, I'm sure. It was just my way of
>> > saying "it will take a while", but I'll file it away as a good thing to
>> > do when there is a moment of spare time.
>>
>> I like the way the php.net homepage has a search box on the homepage
>> with a dropdown next to it to specify what to search.
>
>
> I would recommend a set of check-boxes, so that user can select multiple
> places to search. Eg. search in 8.2 release, 8.0 release, ans as just
> suggested by Alvaro, pgsql-hackers mailing list also.

Too many check-boxes :) Better to have pull-down menu with multiple selections.


     Regards,
         Oleg

Re: Let's play bash the search engine

From
Matthew O'Connor
Date:
Magnus Hagander wrote:
> We still have a searchform at search.postgresql.org. To search
> mainliglists, click Archives search. You can select a list, or a group
> of list, there. Or just search them all.

Anyone see any value in allowing the mailing list search to be sorted by
both ascending and descending date?

Re: Let's play bash the search engine

From
"Gurjeet Singh"
Date:


On 12/19/06, Oleg Bartunov <oleg@sai.msu.su> wrote:
> On Tue, 19 Dec 2006, Gurjeet Singh wrote:
>
> > On 12/19/06, Matthew O'Connor <matthew@zeut.net> wrote:
> >>
> >> Magnus Hagander wrote:
> >> > But seriously, I'm definitly interested in ways it can be improved - and
> >> > that's true of the whole web team, I'm sure. It was just my way of
> >> > saying "it will take a while", but I'll file it away as a good thing to
> >> > do when there is a moment of spare time.
> >>
> >> I like the way the php.net homepage has a search box on the homepage
> >> with a dropdown next to it to specify what to search.
> >
> >
> > I would recommend a set of check-boxes, so that user can select multiple
> > places to search. Eg. search in 8.2 release, 8.0 release, ans as just
> > suggested by Alvaro, pgsql-hackers mailing list also.
>
> Too many check-boxes :)

In hindsight, it does sound like a bad idea.

> Better to have pull-down menu with multiple selections.

Is it possible? I haven't had that much experience with HTML, but it sounds good if it is possible.

Or can we introduce a hideable (hidden by default) div element, toggled with JavaScript, that contains all these check-boxes? I know we don't use JS right now; it's just an idea.


Regards

--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | yahoo }.com

Re: Let's play bash the search engine

From
"Joshua D. Drake"
Date:
On Tue, 2006-12-19 at 17:15 +0100, Magnus Hagander wrote:
> On Tue, Dec 19, 2006 at 09:41:38PM +0530, Gurjeet Singh wrote:
> > On 12/19/06, Matthew O'Connor <matthew@zeut.net> wrote:
> > >
> > >Magnus Hagander wrote:
> > >> But seriously, I'm definitly interested in ways it can be improved - and
> > >> that's true of the whole web team, I'm sure. It was just my way of
> > >> saying "it will take a while", but I'll file it away as a good thing to
> > >> do when there is a moment of spare time.
> > >
> > >I like the way the php.net homepage has a search box on the homepage
> > >with a dropdown next to it to specify what to search.
> >
> >
> > I would recommend a set of check-boxes, so that user can select multiple
> > places to search. Eg. search in 8.2 release, 8.0 release, ans as just
> > suggested by Alvaro, pgsql-hackers mailing list also.
>
> You seriously want this in the header of every webpage?!

Checkboxes IMO are a no-op. However a drop down box is probably
reasonable.

J


>
> We still have a searchform at search.postgresql.org. To search
> mainliglists, click Archives search. You can select a list, or a group
> of list, there. Or just search them all.
>
> Now, the generic website search could be expanded to let you search
> different parts such as the docs, but we cannot possibly do that on the
> searchbox in the header.
>
> //Magnus
>




Re: Let's play bash the search engine

From
"Filip Rembiałkowski"
Date:
2006/12/19, Joshua D. Drake <jd@commandprompt.com>:
> Take a look at let us know what you think and how it performs for you.

http://search.postgresql.org/search?q=HAVING
says  "An error occured while searching."


F.

Re: Let's play bash the search engine

From
Oleg Bartunov
Date:
On Tue, 19 Dec 2006, Filip Rembiałkowski wrote:

> 2006/12/19, Joshua D. Drake <jd@commandprompt.com>:
>> Take a look at let us know what you think and how it performs for you.
>
> http://search.postgresql.org/search?q=HAVING
> says  "An error occured while searching."

I bet HAVING is a stop-word, so the actual message is
'NOTICE:  query contains only stopword(s) or doesn't contain lexeme(s), ignored'

I think we should add the following line to the pg_dict dictionary:

having having

This will prevent 'having' from being recognized as a stop-word by the other
dictionaries, which follow the pg_dict dictionary.
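
A simplified model of why this entry would help, assuming dictionaries are consulted in order and the first one that recognizes a token decides its fate. The two dictionary factories and lexize below are illustrative stand-ins written for this sketch; they are not the tsearch2 API, and the stop-word dictionary here does no stemming.

```python
# Each dictionary returns a list of lexemes if it recognizes the token
# ([] means "recognized as a stop word"), or None to pass the token along.

def synonym_dict(entries):
    """Dictionary that only knows the words listed in `entries`."""
    def lookup(token):
        return [entries[token]] if token in entries else None
    return lookup


def stopword_dict(stop_words):
    """Dictionary that accepts every word but discards stop words."""
    def lookup(token):
        return [] if token in stop_words else [token]
    return lookup


def lexize(token, dictionaries):
    """Return the result of the first dictionary that recognizes the token."""
    token = token.lower()
    for lookup in dictionaries:
        result = lookup(token)
        if result is not None:
            return result
    return []


english = stopword_dict({"having", "and", "or"})
pg_dict = synonym_dict({"having": "having"})
```

With only the stop-word dictionary, lexize("HAVING", [english]) yields no lexemes, which corresponds to the NOTICE above. Putting pg_dict first, as in lexize("HAVING", [pg_dict, english]), lets the synonym entry claim the word before the stop-word list sees it.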

     Regards,
         Oleg

Re: Let's play bash the search engine

From
Steve Atkins
Date:
On Dec 19, 2006, at 8:15 AM, Magnus Hagander wrote:

> On Tue, Dec 19, 2006 at 01:02:09PM -0300, Alvaro Herrera wrote:
>> Matthew O'Connor wrote:
>>> Magnus Hagander wrote:
>>>> But seriously, I'm definitly interested in ways it can be
>>>> improved - and
>>>> that's true of the whole web team, I'm sure. It was just my way of
>>>> saying "it will take a while", but I'll file it away as a good
>>>> thing to
>>>> do when there is a moment of spare time.
>>>
>>> I like the way the php.net homepage has a search box on the homepage
>>> with a dropdown next to it to specify what to search.
>>
>> Yeah, that would be very appropriate, allowing you to search specific
>> version of the docs.  Heck, if it allowed searching of specific mail
>> lists, that would rock.
>
> Definitly worth looking at what we can do without breaking the layout
> there.

Would a link to search.postgresql.org by the search box at the top
of each page be enough?

Cheers,
   Steve



Re: Let's play bash the search engine

From
Alvaro Herrera
Date:
Magnus Hagander wrote:

> As for searching spceific mailinglists, you can do that from the
> archives search page already. Perhaps it needs to be made more
> accessible or something?

Amazing.  What I like the most about it is that it actually works.
Great!

I think it would be good to make it more prominent.  Maybe have all the
search forms integrated on a single page and put a link to it in the top
menu, next to Support.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Let's play bash the search engine

From
"Thomas H."
Date:
>> http://search.postgresql.org/search?q=HAVING
>> says  "An error occured while searching."
>
> I bet HAVING is a stop-word, so actual message is 'NOTICE:  query contains
> only stopword(s) or doesn't contain lexeme(s), ignored'
>
> I think we should add to pg_dict dictionary line
>
> having having

just a thought... wouldn't it make sense for a documentation search index to
*not* have stop words at all? potentially every word that is being searched
for could be contained in a query example, code piece etc and thus seems
important to me... for example keywords like AND, OR etc.
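
A minimal sketch of why a stop-word list can empty out a query such as
"HAVING", and what changes when the list is empty. This is plain Python for
illustration only, not Tsearch2 code; the stop-word list and the function
name `lexemes` are assumptions, and the real configuration used by
search.postgresql.org is not shown in this thread.

```python
import re

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"having", "and", "or", "to", "the"}

def lexemes(query, stop_words):
    """Lowercase the query, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9_]+", query.lower())
    return [w for w in words if w not in stop_words]

# A query made only of stop words leaves nothing to search for.
with_stop_words = lexemes("HAVING", STOP_WORDS)

# With an empty stop-word list, the keyword stays searchable.
without_stop_words = lexemes("HAVING", set())
```

With the list above, the first call yields no lexemes at all, which matches
the "query contains only stopword(s)" situation described in the quoted
reply; the second call keeps "having" as a searchable term.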

- thomas



Re: Let's play bash the search engine

From
Oleg Bartunov
Date:
On Tue, 19 Dec 2006, Thomas H. wrote:

>>> http://search.postgresql.org/search?q=HAVING
>>> says  "An error occured while searching."
>>
>> I bet HAVING is a stop-word, so actual message is 'NOTICE:  query contains
>> only stopword(s) or doesn't contain lexeme(s), ignored'
>>
>> I think we should add to pg_dict dictionary line
>>
>> having having
>
> just a thought... wouldn't it make sense for a documentation search index to
> *not* have stop words at all? potentially every word that is being searched
> for could be contained in a query example, code piece etc and thus seems
> important to me... for example keywords like AND, OR etc.

Ah, I forgot about them. Now, with GIN we could definitely try
stop-word-free search!

>
> - thomas
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

Re: (bash the search engine) Don't split on underscore

From
Reece Hart
Date:
On Tue, 2006-12-19 at 13:25 +0100, Hannes Dorbath wrote:
> I think it would be useful to adjust the parser to not split on underscores:
>
> In case I'd like to lookup PG's to_number() function I won't get
> anything useful. Number is contained nearly everywhere and to is
> configured as stop word. Same with most other functions.

That would be useful and almost certainly result in better specificity.  A counter example is searching for "information schema", for which you'd probably want hits to "information_schema" as well.

-Reece

-- 
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
./universe -G 6.672e-11 -e 1.602e-19 -protonmass 1.673e-27 -uspres bush
kernel warning: universe consuming too many resources. Killing.
universe killed due to catastrophic leadership. Try -uspres carter.

Re: Let's play bash the search engine

From
"Thomas H."
Date:
> I think it would be good to make it more prominent.  Maybe have all the
> search forms integrated on a single page and put a link to it in the top
> menu, next to Support.

well, why not add a dropdown (or even better a multi-select input) on the
search page where users can choose what to search in:

All (default)
Documentation
\_ Most Recent
\_ 8.2
\_ 8.1
\_...
Mailing Lists
\_ Beginners
\_ General
\_ Hackers
\_ ODBC
\_ ...

and so on...

or one could provide options to narrow search results by specifying
parameters to the search, for example "select query url:documentation/8.2"
would return only results whose paths contain the provided url parameter and
whose pages contain the words "select" and "query"...
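
A rough sketch of how such a "url:" parameter could be separated from the
search words and applied as a path filter. Everything here is hypothetical:
the function names, the "url:" prefix handling and the example paths are
assumptions for illustration, not how the search site actually works.

```python
def parse_query(raw):
    """Split a raw query into search words and an optional url filter."""
    words, url_filter = [], None
    for token in raw.split():
        if token.lower().startswith("url:") and len(token) > 4:
            url_filter = token[4:]  # text after the "url:" prefix
        else:
            words.append(token.lower())
    return words, url_filter

def matches(page_path, page_text, words, url_filter):
    """True if the path contains the filter and the text has every word."""
    if url_filter is not None and url_filter not in page_path:
        return False
    text_words = set(page_text.lower().split())
    return all(w in text_words for w in words)

words, url_filter = parse_query("select query url:documentation/8.2")
# Hypothetical page paths, used only to exercise the filter.
hit = matches("/documentation/8.2/sql-select.html",
              "The SELECT query retrieves rows", words, url_filter)
miss = matches("/documentation/8.1/sql-select.html",
               "The SELECT query retrieves rows", words, url_filter)
```

Here `hit` is true because the path contains "documentation/8.2" and the
text contains both words, while `miss` is false because the path belongs to
a different version.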

- thomas



Re: (bash the search engine) Don't split on underscore

From
Hannes Dorbath
Date:
On 19.12.2006 20:32, Reece Hart wrote:
> On Tue, 2006-12-19 at 13:25 +0100, Hannes Dorbath wrote:
> A counter example is searching for "information schema", for which you'd
> probably want hits to "information_schema" as well.

I think `information_schema' should be indexed as:

   - information
   - schema
   - information_schema

There is nothing wrong with indexing both.
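
As a sketch of that idea (plain Python for illustration, not the Tsearch2
parser; the function name `index_terms` is made up here), an identifier
containing underscores could be emitted both as its parts and as a whole:

```python
import re

def index_terms(text):
    """Return index terms, keeping underscore identifiers whole and split."""
    terms = []
    for token in re.findall(r"[a-z0-9_]+", text.lower()):
        parts = [p for p in token.split("_") if p]
        terms.extend(parts)          # information, schema
        if len(parts) > 1:
            terms.append(token)      # information_schema
    return terms

schema_terms = index_terms("information_schema")
function_terms = index_terms("to_number()")
```

With this, a search for "information schema" can still match the parts,
while a search for "to_number" can match the whole identifier even if "to"
alone would be dropped as a stop word.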

--
Regards,
Hannes Dorbath

Re: Let's play bash the search engine

From
magnus@hagander.net
Date:
> > As for searching specific mailing lists, you can do that from the
> > archives search page already. Perhaps it needs to be made more
> > accessible or something?
>
> Amazing.  What I like the most about it is that it actually works.
> Great!

That actually worked on the old one as well. The only new thing there is that you can now select a group (e.g. dev lists)
and it will search in all lists in that group.

> I think it would be good to make it more prominent.  Maybe have all the
> search forms integrated on a single page and put a link to it in the top
> menu, next to Support.

Maybe an 'advanced' link next to it? There should be room. There is not likely to be room for both that and a dropdown of
site sections though, so which is most important?

/magnus