Re: website doc search is extremely SLOW - Mailing list pgsql-general

From Dave Cramer
Subject Re: website doc search is extremely SLOW
Date
Msg-id 1072871834.2937.221.camel@localhost.localdomain
In response to Re: website doc search is extremely SLOW  ("Marc G. Fournier" <scrappy@postgresql.org>)
Responses Re: website doc search is extremely SLOW  ("John Sidney-Woollett" <johnsw@wardbrook.com>)
Re: website doc search is extremely SLOW  ("Marc G. Fournier" <scrappy@postgresql.org>)
List pgsql-general
Marc,

No, it doesn't spider; it is a specialized tool for searching documents.

I'm curious: what value is there in being able to count the number of
URLs?

It does do things like query all documents where CREATE and TABLE are
within n words of each other, and it does so just as fast. I would think
queries like these are more valuable for document searching?
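
To make the "n words apart" idea concrete, here is a minimal sketch of a
proximity query over a positional inverted index. It is an illustration
only, not the code of the tool discussed here: the class ProximityIndex,
its methods add and within, and the sample documents are all hypothetical,
and "n words apart" is assumed to mean a difference in word position of at
most n, in either order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a positional inverted index; not the actual tool.
class ProximityIndex {
    // term -> (document id -> ascending word positions of the term in that document)
    private final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    // Split on non-alphanumeric characters and record each word's position.
    void add(int docId, String text) {
        int pos = 0;
        for (String w : text.toLowerCase().split("[^a-z0-9]+")) {
            if (w.isEmpty()) continue; // leading separator yields an empty token
            postings.computeIfAbsent(w, k -> new HashMap<>())
                    .computeIfAbsent(docId, k -> new ArrayList<>())
                    .add(pos++);
        }
    }

    // Ids of documents in which a and b occur at most n positions apart.
    Set<Integer> within(String a, String b, int n) {
        Map<Integer, List<Integer>> pa =
                postings.getOrDefault(a.toLowerCase(), Collections.emptyMap());
        Map<Integer, List<Integer>> pb =
                postings.getOrDefault(b.toLowerCase(), Collections.emptyMap());
        Set<Integer> hits = new TreeSet<>();
        for (Map.Entry<Integer, List<Integer>> e : pa.entrySet()) {
            List<Integer> other = pb.get(e.getKey());
            if (other != null && near(e.getValue(), other, n)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    // Two-pointer walk over two ascending position lists: advance whichever
    // pointer is behind until a pair within n is found or a list runs out.
    private static boolean near(List<Integer> xs, List<Integer> ys, int n) {
        int i = 0, j = 0;
        while (i < xs.size() && j < ys.size()) {
            int d = xs.get(i) - ys.get(j);
            if (Math.abs(d) <= n) return true;
            if (d < 0) i++; else j++;
        }
        return false;
    }

    public static void main(String[] args) {
        ProximityIndex idx = new ProximityIndex();
        idx.add(1, "CREATE TABLE defines a new table");
        idx.add(2, "CREATE a trigger that fires when the table changes");
        System.out.println(idx.within("create", "table", 2)); // prints [1]
    }
}
```

Because each posting list keeps positions in ascending order, the check
for one document is a single linear pass over the two lists, which is why
a proximity query of this kind need not cost much more than a plain AND
query.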

I think the challenge here is deciding what we want to search. I am
betting that folks use this page as they would man, i.e. to answer
questions like "what is the command for create trigger?"

As I said, my offer to help out stands, but if the goal is to search the
entire website, then this particular tool is not useful.

At this point I am working on indexing the SGML directly, as it has less
cruft in it. For instance, all the links that appear in every summary are
just noise.


Dave

On Wed, 2003-12-31 at 00:44, Marc G. Fournier wrote:
> On Wed, 31 Dec 2003, Dave Cramer wrote:
>
> > I can modify mine to be client server if you want?
> >
> > It is a java app, so we need to be able to run jdk1.3 at least?
>
> jdk1.4 is available on the VMs ... does yours spider?  for instance, you
> mention that you have the docs indexed right now, but we are currently
> indexing:
>
> Server http://archives.postgresql.org/
> Server http://advocacy.postgresql.org/
> Server http://developer.postgresql.org/
> Server http://gborg.postgresql.org/
> Server http://pgadmin.postgresql.org/
> Server http://techdocs.postgresql.org/
> Server http://www.postgresql.org/
>
> will it be able to handle:
>
> 186_archives=# select count(*) from url;
>  count
> --------
>  393551
> (1 row)
>
> as fast as you are finding with just the docs?
>
> ----
> Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
> Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
>
--
Dave Cramer
519 939 0336
ICQ # 1467551

