Searching http://www.postgresql.org ...

From
The Hermit Hacker
Date:
Well, finally have a good portion of the web site now searchable...the
following "sections" of our web site are available:

Server http://www.postgresql.org/docs
Server http://www.postgresql.org/mhonarc/pgsql-admin
Server http://www.postgresql.org/mhonarc/pgsql-announce
Server http://www.postgresql.org/mhonarc/pgsql-bugs
Server http://www.postgresql.org/mhonarc/pgsql-docs
Server http://www.postgresql.org/mhonarc/pgsql-general
Server http://www.postgresql.org/mhonarc/pgsql-hackers
Server http://www.postgresql.org/mhonarc/pgsql-mirrors
Server http://www.postgresql.org/mhonarc/pgsql-novice
Server http://www.postgresql.org/mhonarc/pgsql-sql

On Tues/Weds of this week, we're having a 9.1 GB drive installed that will
be dedicated to UdmSearch, so the rest of the site will be indexed at that
time...

For now, if you go to:

    http://www.postgresql.org/search.cgi

You can make use of the search engine...

Just to give you an idea of the size of the tables that are currently
loaded:

    dict: 6856951 tuples in ~373Meg
     url:   46416 tuples in ~ 23Meg
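
As a rough sense of scale, the figures above can be turned into an average
size per tuple. This is only a sketch: it assumes "Meg" means 2**20 bytes,
the sizes themselves are approximate, and the names used are made up for
illustration.

```python
# Average bytes per tuple, from the table sizes quoted above.
# Assumption: "Meg" = 2**20 bytes; the quoted sizes are approximate ("~").
MEG = 2**20

tables = {
    "dict": (6856951, 373 * MEG),
    "url": (46416, 23 * MEG),
}

def bytes_per_tuple(tuples, size_bytes):
    """Return the average size of one tuple in bytes."""
    return size_bytes / tuples

for name, (tuples, size_bytes) in tables.items():
    print(f"{name}: ~{bytes_per_tuple(tuples, size_bytes):.0f} bytes/tuple")
```

Under those assumptions a dict tuple averages a few dozen bytes and a url
tuple several hundred.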

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org


Re: [ANNOUNCE] Searching http://www.postgresql.org ...

From
Hannu Krosing
Date:
The Hermit Hacker wrote:
>
> Well, finally have a good portion of the web site now searchable...the
> following "sections" of other web site is available:

Having implemented a full-text index myself a few years ago, I tried some
more taxing queries: those that return most of the pages, and that could
(should) be banned by stop words or some smarter technique, like
partitioning the search space by date or some other attribute first.

The results:

of            31220 matches   <10 sec
at            16002 matches   <5 sec
PostgreSQL    24807 matches   <10 sec
I             infinite (maybe a computer crash? www.postgresql.org is
              also unreachable)

please don't sue me ;)
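
To put the match counts in proportion, the sketch below expresses each one
as a fraction of the indexed pages and flags words that match more than
half of them as stop-word candidates. It assumes the 46416 tuples in the
url table from the announcement are the indexed pages and that each match
is one page; the 0.5 threshold and the function name are hypothetical.

```python
# Fraction of indexed pages matched by each query word.
# Assumption: the 46416 "url" tuples from the announcement are the indexed
# pages, and each reported match corresponds to one page.
INDEXED_PAGES = 46416

matches = {"of": 31220, "at": 16002, "PostgreSQL": 24807}

def stop_word_candidates(match_counts, total_pages, threshold=0.5):
    """Return the words that match more than `threshold` of all pages."""
    return sorted(
        word
        for word, count in match_counts.items()
        if count / total_pages > threshold
    )

for word, count in matches.items():
    print(f"{word}: {count / INDEXED_PAGES:.0%} of pages")

print(stop_word_candidates(matches, INDEXED_PAGES))
```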

----------
Hannu

Re: [ANNOUNCE] Searching http://www.postgresql.org ...

From
Hannu Krosing
Date:
Hannu Krosing wrote:
>
> The Hermit Hacker wrote:
> >
> > Well, finally have a good portion of the web site now searchable...the
> > following "sections" of other web site is available:
>
> Having implemented a full-text index myself a few years ago, i tried some
> more taxing queries (those that return most of the pages, and that could
> (should) be banned by stop words or some smarter techniques (like
> partitioning the search space by date or some other attribute first)
>
> the results
>
> of            31220 matches <10 sec
> at            16002 matches <5 sec
> PostgreSQL    24807 matches <10 sec
> I             infinite, (maybe computer crash ?, www.postgresql.org also
> unreachable)

Seems it survived: the result came back after ~5 min, with 35827 matches.

During the search www.postgresql.org was unreachable at least from my
computer.

I will stop testing for now.

----------------
Hannu

Re: [ANNOUNCE] Searching http://www.postgresql.org ...

From
The Hermit Hacker
Date:
On Mon, 10 Jan 2000, Hannu Krosing wrote:

> The Hermit Hacker wrote:
> >
> > Well, finally have a good portion of the web site now searchable...the
> > following "sections" of other web site is available:
>
> Having implemented a full-text index myself a few years ago, i tried some
> more taxing queries (those that return most of the pages, and that could
> (should) be banned by stop words or some smarter techniques (like
> partitioning the search space by date or some other attribute first)
>
> the results
>
> of            31220 matches <10 sec
> at            16002 matches <5 sec
> PostgreSQL    24807 matches <10 sec
> I             infinite, (maybe computer crash ?, www.postgresql.org also
> unreachable)
>
> please dont sue me ;)

You didn't crash anything, no worry...:)

As for stopwords, they do provide a mechanism for doing this, and, ummm, I
forgot to load it :)

Loaded now, so the index should shrink as entries get recycled/expired...I
don't want to start this all from scratch again :(
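
For readers unfamiliar with the idea, a stop-word filter can be sketched in
a few lines. This is not UdmSearch's mechanism, which is not shown in this
thread; the word list and the function name below are illustrative
assumptions, and the sketch filters query words rather than the index.

```python
# Minimal stop-word filter applied to a query before it reaches the index.
# Assumption: STOP_WORDS is a hand-picked illustrative list, not the list
# used by UdmSearch.
STOP_WORDS = {"of", "at", "i", "the", "a", "an", "and", "to", "in"}

def filter_query(query):
    """Split a query on whitespace and drop stop words (case-insensitive)."""
    return [word for word in query.split() if word.lower() not in STOP_WORDS]

print(filter_query("size of the dict table"))  # ['size', 'dict', 'table']
print(filter_query("I"))                       # []
```

A query made up only of stop words ends up empty, so it can be rejected
before it touches the database.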

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org


Re: [ANNOUNCE] Searching http://www.postgresql.org ...

From
Vince Vielhaber
Date:
On 10-Jan-00 Hannu Krosing wrote:
> The Hermit Hacker wrote:
>>
>> Well, finally have a good portion of the web site now searchable...the
>> following "sections" of other web site is available:
>
> Having implemented a full-text index myself a few years ago, i tried some
> more taxing queries (those that return most of the pages, and that could
> (should) be banned by stop words or some smarter techniques (like
> partitioning the search space by date or some other attribute first)
>
> the results
>
> of            31220 matches <10 sec
> at            16002 matches <5 sec
> PostgreSQL    24807 matches <10 sec
> I             infinite, (maybe computer crash ?, www.postgresql.org also
> unreachable)
>
> please dont sue me ;)
>
> ----------
> Hannu
>
> ************
>

You may have lost a route somewhere between you and hub.  I'm logged into
it now and it hasn't rebooted recently.

Vince.
--
==========================================================================
Vince Vielhaber -- KA8CSH    email: vev@michvhf.com    http://www.pop4.net
   128K ISDN: $24.95/mo or less - 56K Dialup: $17.95/mo or less at Pop4
        Online Campground Directory    http://www.camping-usa.com
       Online Giftshop Superstore    http://www.cloudninegifts.com
==========================================================================



Re: [ANNOUNCE] Searching http://www.postgresql.org ...

From
Tom Lane
Date:
Hannu Krosing <hannu@tm.ee> writes:
>> the results
>> of            31220 matches <10 sec
>> at            16002 matches <5 sec
>> PostgreSQL    24807 matches <10 sec
>> I             infinite, (maybe computer crash ?, www.postgresql.org also
>> unreachable)

> Seems it survived (result after ~5 min) 35827 matches.

> During the search www.postgresql.org was unreachable at least from my
> computer.

Sounds to me like you were seeing a transient network outage.  The
35k-row query probably didn't take *that* much longer than the 31k-row
query --- but maybe www.postgresql.org's packets couldn't get to you
for a while.
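
The arithmetic behind that estimate can be sketched as follows. It assumes
query time grows roughly in proportion to the number of matching rows, and
it takes the full upper bound of 10 seconds for the 31220-row query; both
are simplifications, and the function name is made up.

```python
# Linear extrapolation of query time from the number of matching rows.
# Assumptions: time is proportional to rows returned, and the 31220-row
# query took its full reported upper bound of 10 seconds.
def extrapolate_seconds(known_rows, known_seconds, target_rows):
    """Scale a known query time to a different row count."""
    return known_seconds * target_rows / known_rows

estimate = extrapolate_seconds(31220, 10.0, 35827)
observed = 5 * 60  # "~5 min" reported for the 35827-match query

print(f"estimated: {estimate:.1f} s, observed: ~{observed} s")
print(f"observed / estimated: about {observed / estimate:.0f}x")
```

Under those assumptions the larger query should have taken seconds rather
than minutes, which is consistent with the delay coming from the network
and not from the search itself.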

            regards, tom lane