Thread: PostgreSQL idocs

PostgreSQL idocs

From
Francois Suter
Date:
Hi all,

I don't know who is responsible for the PostgreSQL's interactive
documentation so I'm writing to the general ML.

As you probably all noticed, the idocs are pretty limited because it is not
possible to search several words at the same time (either with AND or OR),
as explained in the search help.

Since I recently had to develop something similar (in PHP), here is the code
(attached file), if it can be useful and make the idocs better. It's maybe
not the most efficient algorithm in the world, because it's the first
version and I didn't yet have time to optimize it, but it does the job.

Usage is as follows:

- assume the search text field is called "search"
- assume the text field in the database is called "text"

you can write something like that:

$query = "SELECT * FROM table_name WHERE
".searchString("search","text",true,false);

The third parameter is true for a case-insensitive search (using ILIKE). The
fourth parameters forces all words in the search string to be considered
with "AND".

The way it works is that it takes the search string and tokenizes it around
blanks and + signs. Blanks are interpreted as ORs and + as ANDs. Hence for a
search like "word1+word2 word3", it will return the following string:

"text" ILIKE '%word1%' AND "text" ILIKE '%word2%' OR "text" ILIKE '%word3%'

Pretty rough, but it works. Hope this can help somehow.

--------
Francois

Home page: http://www.monpetitcoin.com/
"We waste our time spending money we don't have to buy things we don't need
to impress people we don't like"


Attachment

Re: PostgreSQL idocs

From
Oleg Bartunov
Date:
Why not just use OpenFTS for full text searching in idocs or just tsearch ?
It should works for such collection perfectly !

    Oleg
On Thu, 12 Dec 2002, Francois Suter wrote:

> Hi all,
>
> I don't know who is responsible for the PostgreSQL's interactive
> documentation so I'm writing to the general ML.
>
> As you probably all noticed, the idocs are pretty limited because it is not
> possible to search several words at the same time (either with AND or OR),
> as explained in the search help.
>
> Since I recently had to develop something similar (in PHP), here is the code
> (attached file), if it can be useful and make the idocs better. It's maybe
> not the most efficient algorithm in the world, because it's the first
> version and I didn't yet have time to optimize it, but it does the job.
>
> Usage is as follows:
>
> - assume the search text field is called "search"
> - assume the text field in the database is called "text"
>
> you can write something like that:
>
> $query = "SELECT * FROM table_name WHERE
> ".searchString("search","text",true,false);
>
> The third parameter is true for a case-insensitive search (using ILIKE). The
> fourth parameters forces all words in the search string to be considered
> with "AND".
>
> The way it works is that it takes the search string and tokenizes it around
> blanks and + signs. Blanks are interpreted as ORs and + as ANDs. Hence for a
> search like "word1+word2 word3", it will return the following string:
>
> "text" ILIKE '%word1%' AND "text" ILIKE '%word2%' OR "text" ILIKE '%word3%'
>
> Pretty rough, but it works. Hope this can help somehow.
>
> --------
> Francois
>
> Home page: http://www.monpetitcoin.com/
> "We waste our time spending money we don't have to buy things we don't need
> to impress people we don't like"
>
>

    Regards,
        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Re: PostgreSQL idocs

From
Francois Suter
Date:
> Why not just use OpenFTS for full text searching in idocs or just tsearch ?
> It should works for such collection perfectly !

I agree that there's bound to be some open source search engine that could
be used (and that would produce much better results than my little
function), but since it hasn't been done, there's probably a reason, but
which?

--------
Francois

Home page: http://www.monpetitcoin.com/
"We waste our time spending money we don't have to buy things we don't need
to impress people we don't like"


Re: PostgreSQL idocs

From
"scott.marlowe"
Date:
On Thu, 12 Dec 2002, Francois Suter wrote:

> Hi all,
>
> I don't know who is responsible for the PostgreSQL's interactive
> documentation so I'm writing to the general ML.
>
> As you probably all noticed, the idocs are pretty limited because it is not
> possible to search several words at the same time (either with AND or OR),
> as explained in the search help.
>
> Since I recently had to develop something similar (in PHP), here is the code
> (attached file), if it can be useful and make the idocs better. It's maybe
> not the most efficient algorithm in the world, because it's the first
> version and I didn't yet have time to optimize it, but it does the job.
>
> Usage is as follows:
>
> - assume the search text field is called "search"
> - assume the text field in the database is called "text"
>
> you can write something like that:
>
> $query = "SELECT * FROM table_name WHERE
> ".searchString("search","text",true,false);
>
> The third parameter is true for a case-insensitive search (using ILIKE). The
> fourth parameters forces all words in the search string to be considered
> with "AND".
>
> The way it works is that it takes the search string and tokenizes it around
> blanks and + signs. Blanks are interpreted as ORs and + as ANDs. Hence for a
> search like "word1+word2 word3", it will return the following string:
>
> "text" ILIKE '%word1%' AND "text" ILIKE '%word2%' OR "text" ILIKE '%word3%'
>
> Pretty rough, but it works. Hope this can help somehow.

Actually, has anyone looked at throwing htdig on top of the idocs site?
The newest betas are pretty stable (we are using them in production at
work) and allow phrase searching, and can do either an ALL, AND, or
Boolean type of search.  It includes soundex fuzzy matching, as well as
many other features (i.e. expands work to works, working, worked, etc...)

It's a great search engine and VERY fast.  I love postgresql, but for this
kind of thing, I prefer htdig.


Re: PostgreSQL idocs

From
Bret Hughes
Date:
Sorry to Scott for the reply to his mail only. I wish evolution would
reply to the list automatically instead  of the original sender.  Is
there a historical reason these lists do not use a
Reply-To:pgsql-whatever@postgresql.org in the headers?

On Thu, 2002-12-12 at 10:15, scott.marlowe wrote:
> On Thu, 12 Dec 2002, Francois Suter wrote:
>
> > Hi all,
> >
> > I don't know who is responsible for the PostgreSQL's interactive
> > documentation so I'm writing to the general ML.

While we are on the subject of docs, I will show my ignorance and ask
how do I convert the sgml doc provided in the redhat rpms to html for
local reading?  then I could search my own doc tree.

I don't know (*&%# about sgml.  I tried sqml2html but it barfed telling
me that it only understood linuxdoc format. I have to admit I only spent
about 30 minutes looking for the solution.

Bret


Re: PostgreSQL idocs

From
Robert Treat
Date:
Just use reply-to-all from evolution. Historically there have been times
when posting via the mailing lists have been slow, so direct emails have
become a defacto standard for these lists, with the list being cc'd to
archive things.

Robert Treat

On Thu, 2002-12-12 at 13:33, Bret Hughes wrote:
> Sorry to Scott for the reply to his mail only. I wish evolution would
> reply to the list automatically instead  of the original sender.  Is
> there a historical reason these lists do not use a
> Reply-To:pgsql-whatever@postgresql.org in the headers?
>
> On Thu, 2002-12-12 at 10:15, scott.marlowe wrote:
> > On Thu, 12 Dec 2002, Francois Suter wrote:
> >
> > > Hi all,
> > >
> > > I don't know who is responsible for the PostgreSQL's interactive
> > > documentation so I'm writing to the general ML.
>
> While we are on the subject of docs, I will show my ignorance and ask
> how do I convert the sgml doc provided in the redhat rpms to html for
> local reading?  then I could search my own doc tree.
>
> I don't know (*&%# about sgml.  I tried sqml2html but it barfed telling
> me that it only understood linuxdoc format. I have to admit I only spent
> about 30 minutes looking for the solution.
>
> Bret
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)




Re: PostgreSQL idocs

From
Medi Montaseri
Date:
Yes I'm also in favor of grabbing one of those 'my search engine'
offerings by google and
other vendor/providers...too late to develope yet another search engine...

Oleg Bartunov wrote:

>Why not just use OpenFTS for full text searching in idocs or just tsearch ?
>It should works for such collection perfectly !
>
>    Oleg
>On Thu, 12 Dec 2002, Francois Suter wrote:
>
>
>
>>Hi all,
>>
>>I don't know who is responsible for the PostgreSQL's interactive
>>documentation so I'm writing to the general ML.
>>
>>As you probably all noticed, the idocs are pretty limited because it is not
>>possible to search several words at the same time (either with AND or OR),
>>as explained in the search help.
>>
>>Since I recently had to develop something similar (in PHP), here is the code
>>(attached file), if it can be useful and make the idocs better. It's maybe
>>not the most efficient algorithm in the world, because it's the first
>>version and I didn't yet have time to optimize it, but it does the job.
>>
>>Usage is as follows:
>>
>>- assume the search text field is called "search"
>>- assume the text field in the database is called "text"
>>
>>you can write something like that:
>>
>>$query = "SELECT * FROM table_name WHERE
>>".searchString("search","text",true,false);
>>
>>The third parameter is true for a case-insensitive search (using ILIKE). The
>>fourth parameters forces all words in the search string to be considered
>>with "AND".
>>
>>The way it works is that it takes the search string and tokenizes it around
>>blanks and + signs. Blanks are interpreted as ORs and + as ANDs. Hence for a
>>search like "word1+word2 word3", it will return the following string:
>>
>>"text" ILIKE '%word1%' AND "text" ILIKE '%word2%' OR "text" ILIKE '%word3%'
>>
>>Pretty rough, but it works. Hope this can help somehow.
>>
>>--------
>>Francois
>>
>>Home page: http://www.monpetitcoin.com/
>>"We waste our time spending money we don't have to buy things we don't need
>>to impress people we don't like"
>>
>>
>>
>>
>
>    Regards,
>        Oleg
>_____________________________________________________________
>Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
>Sternberg Astronomical Institute, Moscow University (Russia)
>Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
>phone: +007(095)939-16-83, +007(095)939-23-83
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 4: Don't 'kill -9' the postmaster
>
>




Re: PostgreSQL idocs

From
Bret Hughes
Date:
On Thu, 2002-12-12 at 12:54, Robert Treat wrote:
> Just use reply-to-all from evolution. Historically there have been times
> when posting via the mailing lists have been slow, so direct emails have
> become a defacto standard for these lists, with the list being cc'd to
> archive things.
>
> Robert Treat
>


Thanks Robert.


Re: PostgreSQL idocs

From
Lincoln Yeoh
Date:
There's also google ;).

Example search:
site:postgresql.org  "interactive documentation" tuning wal -optimizer

It's convenient that the phrase "interactive documentation" is on every
idoc page ;). Leave that phrase out if you want to search the whole site.

You can also read the docs using google's cache - e.g. if idocs goes down
or you want to save postgresql.org's bandwidth and don't mind slightly
older docs.

<grin type=mischievous >
It's a great search engine and very fast, etc...
;)
</grin>

Yah I know, shouldn't take google for granted, it's good for a site to have
it's own search engine. Unfortunately, I've found that Google's search
works better than most sites own search engines. Either 0 documents found,
or 11523 documents found with useless marketing/PR junk topmost.

Still most crawlers don't help for private sites with access controls -
e.g. search results different depending on your access (can't have secret
docs turning in everyone's searches :) ).

Cheerio,
Link.

At 09:15 AM 12/12/02 -0700, scott.marlowe wrote:


>Actually, has anyone looked at throwing htdig on top of the idocs site?
>The newest betas are pretty stable (we are using them in production at
>work) and allow phrase searching, and can do either an ALL, AND, or
>Boolean type of search.  It includes soundex fuzzy matching, as well as
>many other features (i.e. expands work to works, working, worked, etc...)
>
>It's a great search engine and VERY fast.  I love postgresql, but for this
>kind of thing, I prefer htdig.



Re: PostgreSQL idocs

From
Manfred Koizar
Date:
On Fri, 13 Dec 2002 04:36:32 +0800, Lincoln Yeoh <lyeoh@pop.jaring.my>
wrote:
>You can also read the docs using google's cache

... with the added benefit that your search words are highlighted :-)

Servus
 Manfred