Thread: PostgreSQL idocs
Hi all, I don't know who is responsible for the PostgreSQL's interactive documentation so I'm writing to the general ML. As you probably all noticed, the idocs are pretty limited because it is not possible to search several words at the same time (either with AND or OR), as explained in the search help. Since I recently had to develop something similar (in PHP), here is the code (attached file), if it can be useful and make the idocs better. It's maybe not the most efficient algorithm in the world, because it's the first version and I didn't yet have time to optimize it, but it does the job. Usage is as follows: - assume the search text field is called "search" - assume the text field in the database is called "text" you can write something like that: $query = "SELECT * FROM table_name WHERE ".searchString("search","text",true,false); The third parameter is true for a case-insensitive search (using ILIKE). The fourth parameters forces all words in the search string to be considered with "AND". The way it works is that it takes the search string and tokenizes it around blanks and + signs. Blanks are interpreted as ORs and + as ANDs. Hence for a search like "word1+word2 word3", it will return the following string: "text" ILIKE '%word1%' AND "text" ILIKE '%word2%' OR "text" ILIKE '%word3%' Pretty rough, but it works. Hope this can help somehow. -------- Francois Home page: http://www.monpetitcoin.com/ "We waste our time spending money we don't have to buy things we don't need to impress people we don't like"
Attachment
Why not just use OpenFTS for full text searching in idocs or just tsearch ? It should works for such collection perfectly ! Oleg On Thu, 12 Dec 2002, Francois Suter wrote: > Hi all, > > I don't know who is responsible for the PostgreSQL's interactive > documentation so I'm writing to the general ML. > > As you probably all noticed, the idocs are pretty limited because it is not > possible to search several words at the same time (either with AND or OR), > as explained in the search help. > > Since I recently had to develop something similar (in PHP), here is the code > (attached file), if it can be useful and make the idocs better. It's maybe > not the most efficient algorithm in the world, because it's the first > version and I didn't yet have time to optimize it, but it does the job. > > Usage is as follows: > > - assume the search text field is called "search" > - assume the text field in the database is called "text" > > you can write something like that: > > $query = "SELECT * FROM table_name WHERE > ".searchString("search","text",true,false); > > The third parameter is true for a case-insensitive search (using ILIKE). The > fourth parameters forces all words in the search string to be considered > with "AND". > > The way it works is that it takes the search string and tokenizes it around > blanks and + signs. Blanks are interpreted as ORs and + as ANDs. Hence for a > search like "word1+word2 word3", it will return the following string: > > "text" ILIKE '%word1%' AND "text" ILIKE '%word2%' OR "text" ILIKE '%word3%' > > Pretty rough, but it works. Hope this can help somehow. > > -------- > Francois > > Home page: http://www.monpetitcoin.com/ > "We waste our time spending money we don't have to buy things we don't need > to impress people we don't like" > > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
> Why not just use OpenFTS for full text searching in idocs or just tsearch ? > It should works for such collection perfectly ! I agree that there's bound to be some open source search engine that could be used (and that would produce much better results than my little function), but since it hasn't been done, there's probably a reason, but which? -------- Francois Home page: http://www.monpetitcoin.com/ "We waste our time spending money we don't have to buy things we don't need to impress people we don't like"
On Thu, 12 Dec 2002, Francois Suter wrote: > Hi all, > > I don't know who is responsible for the PostgreSQL's interactive > documentation so I'm writing to the general ML. > > As you probably all noticed, the idocs are pretty limited because it is not > possible to search several words at the same time (either with AND or OR), > as explained in the search help. > > Since I recently had to develop something similar (in PHP), here is the code > (attached file), if it can be useful and make the idocs better. It's maybe > not the most efficient algorithm in the world, because it's the first > version and I didn't yet have time to optimize it, but it does the job. > > Usage is as follows: > > - assume the search text field is called "search" > - assume the text field in the database is called "text" > > you can write something like that: > > $query = "SELECT * FROM table_name WHERE > ".searchString("search","text",true,false); > > The third parameter is true for a case-insensitive search (using ILIKE). The > fourth parameters forces all words in the search string to be considered > with "AND". > > The way it works is that it takes the search string and tokenizes it around > blanks and + signs. Blanks are interpreted as ORs and + as ANDs. Hence for a > search like "word1+word2 word3", it will return the following string: > > "text" ILIKE '%word1%' AND "text" ILIKE '%word2%' OR "text" ILIKE '%word3%' > > Pretty rough, but it works. Hope this can help somehow. Actually, has anyone looked at throwing htdig on top of the idocs site? The newest betas are pretty stable (we are using them in production at work) and allow phrase searching, and can do either an ALL, AND, or Boolean type of search. It includes soundex fuzzy matching, as well as many other features (i.e. expands work to works, working, worked, etc...) It's a great search engine and VERY fast. I love postgresql, but for this kind of thing, I prefer htdig.
Sorry to Scott for the reply to his mail only. I wish evolution would reply to the list automatically instead of the original sender. Is there a historical reason these lists do not use a Reply-To:pgsql-whatever@postgresql.org in the headers? On Thu, 2002-12-12 at 10:15, scott.marlowe wrote: > On Thu, 12 Dec 2002, Francois Suter wrote: > > > Hi all, > > > > I don't know who is responsible for the PostgreSQL's interactive > > documentation so I'm writing to the general ML. While we are on the subject of docs, I will show my ignorance and ask how do I convert the sgml doc provided in the redhat rpms to html for local reading? then I could search my own doc tree. I don't know (*&%# about sgml. I tried sqml2html but it barfed telling me that it only understood linuxdoc format. I have to admit I only spent about 30 minutes looking for the solution. Bret
Just use reply-to-all from evolution. Historically there have been times when posting via the mailing lists have been slow, so direct emails have become a defacto standard for these lists, with the list being cc'd to archive things. Robert Treat On Thu, 2002-12-12 at 13:33, Bret Hughes wrote: > Sorry to Scott for the reply to his mail only. I wish evolution would > reply to the list automatically instead of the original sender. Is > there a historical reason these lists do not use a > Reply-To:pgsql-whatever@postgresql.org in the headers? > > On Thu, 2002-12-12 at 10:15, scott.marlowe wrote: > > On Thu, 12 Dec 2002, Francois Suter wrote: > > > > > Hi all, > > > > > > I don't know who is responsible for the PostgreSQL's interactive > > > documentation so I'm writing to the general ML. > > While we are on the subject of docs, I will show my ignorance and ask > how do I convert the sgml doc provided in the redhat rpms to html for > local reading? then I could search my own doc tree. > > I don't know (*&%# about sgml. I tried sqml2html but it barfed telling > me that it only understood linuxdoc format. I have to admit I only spent > about 30 minutes looking for the solution. > > Bret > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
Yes I'm also in favor of grabbing one of those 'my search engine' offerings by google and other vendor/providers...too late to develope yet another search engine... Oleg Bartunov wrote: >Why not just use OpenFTS for full text searching in idocs or just tsearch ? >It should works for such collection perfectly ! > > Oleg >On Thu, 12 Dec 2002, Francois Suter wrote: > > > >>Hi all, >> >>I don't know who is responsible for the PostgreSQL's interactive >>documentation so I'm writing to the general ML. >> >>As you probably all noticed, the idocs are pretty limited because it is not >>possible to search several words at the same time (either with AND or OR), >>as explained in the search help. >> >>Since I recently had to develop something similar (in PHP), here is the code >>(attached file), if it can be useful and make the idocs better. It's maybe >>not the most efficient algorithm in the world, because it's the first >>version and I didn't yet have time to optimize it, but it does the job. >> >>Usage is as follows: >> >>- assume the search text field is called "search" >>- assume the text field in the database is called "text" >> >>you can write something like that: >> >>$query = "SELECT * FROM table_name WHERE >>".searchString("search","text",true,false); >> >>The third parameter is true for a case-insensitive search (using ILIKE). The >>fourth parameters forces all words in the search string to be considered >>with "AND". >> >>The way it works is that it takes the search string and tokenizes it around >>blanks and + signs. Blanks are interpreted as ORs and + as ANDs. Hence for a >>search like "word1+word2 word3", it will return the following string: >> >>"text" ILIKE '%word1%' AND "text" ILIKE '%word2%' OR "text" ILIKE '%word3%' >> >>Pretty rough, but it works. Hope this can help somehow. >> >>-------- >>Francois >> >>Home page: http://www.monpetitcoin.com/ >>"We waste our time spending money we don't have to buy things we don't need >>to impress people we don't like" >> >> >> >> > > Regards, > Oleg >_____________________________________________________________ >Oleg Bartunov, sci.researcher, hostmaster of AstroNet, >Sternberg Astronomical Institute, Moscow University (Russia) >Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ >phone: +007(095)939-16-83, +007(095)939-23-83 > > >---------------------------(end of broadcast)--------------------------- >TIP 4: Don't 'kill -9' the postmaster > >
On Thu, 2002-12-12 at 12:54, Robert Treat wrote: > Just use reply-to-all from evolution. Historically there have been times > when posting via the mailing lists have been slow, so direct emails have > become a defacto standard for these lists, with the list being cc'd to > archive things. > > Robert Treat > Thanks Robert.
There's also google ;). Example search: site:postgresql.org "interactive documentation" tuning wal -optimizer It's convenient that the phrase "interactive documentation" is on every idoc page ;). Leave that phrase out if you want to search the whole site. You can also read the docs using google's cache - e.g. if idocs goes down or you want to save postgresql.org's bandwidth and don't mind slightly older docs. <grin type=mischievous > It's a great search engine and very fast, etc... ;) </grin> Yah I know, shouldn't take google for granted, it's good for a site to have it's own search engine. Unfortunately, I've found that Google's search works better than most sites own search engines. Either 0 documents found, or 11523 documents found with useless marketing/PR junk topmost. Still most crawlers don't help for private sites with access controls - e.g. search results different depending on your access (can't have secret docs turning in everyone's searches :) ). Cheerio, Link. At 09:15 AM 12/12/02 -0700, scott.marlowe wrote: >Actually, has anyone looked at throwing htdig on top of the idocs site? >The newest betas are pretty stable (we are using them in production at >work) and allow phrase searching, and can do either an ALL, AND, or >Boolean type of search. It includes soundex fuzzy matching, as well as >many other features (i.e. expands work to works, working, worked, etc...) > >It's a great search engine and VERY fast. I love postgresql, but for this >kind of thing, I prefer htdig.
On Fri, 13 Dec 2002 04:36:32 +0800, Lincoln Yeoh <lyeoh@pop.jaring.my> wrote: >You can also read the docs using google's cache ... with the added benefit that your search words are highlighted :-) Servus Manfred