Re: Bug with Tsearch and tsvector - Mailing list pgsql-bugs

From Kevin Grittner
Subject Re: Bug with Tsearch and tsvector
Msg-id 4BD592F80200002500030DF9@gw.wicourts.gov
In response to Re: Bug with Tsearch and tsvector  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Bug with Tsearch and tsvector  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> ie the critical point seems to be that url_path is willing to soak
> up a string containing "<" and ">", so the span tags don't get
> recognized as separate lexemes.  While that's "obviously" the
> wrong thing in this particular example, I'm not sure if it's the
> wrong thing in general. Can anyone comment on the frequency of
> usage of those two symbols in URLs?

RFC 2396 (http://www.ietf.org/rfc/rfc2396.txt), section 2.4.3, lists
"<" and ">" among the "delims" characters, which are expressly
excluded from URIs.
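
To make the RFC's rule concrete, here is a minimal Python sketch of one way a scanner could stop a URL candidate at the first excluded delimiter. It is only an illustration and not how the PostgreSQL text search parser is implemented. The helper name truncate_at_delims, the constant name, and the sample string are all hypothetical. It checks only "<", ">" and the double quote, the characters RFC 2396 section 2.4.3 says are excluded because they often delimit URIs in surrounding text.

```python
# Hypothetical helper for illustration only; this is NOT PostgreSQL parser code.
# RFC 2396 section 2.4.3 excludes "<", ">" and the double quote from URIs
# because they are often used as delimiters around URIs in text.
ANGLE_AND_QUOTE_DELIMS = frozenset('<>"')


def truncate_at_delims(candidate: str) -> str:
    """Return the part of candidate that precedes the first excluded delimiter."""
    for index, char in enumerate(candidate):
        if char in ANGLE_AND_QUOTE_DELIMS:
            return candidate[:index]
    # No excluded delimiter found: the whole candidate is kept.
    return candidate


if __name__ == "__main__":
    # Made-up sample: a path immediately followed by a closing span tag.
    print(truncate_at_delims("example.com/path</span>"))
```

Under this rule the trailing tag would be left for the scanner to treat as a separate token instead of being absorbed into the path.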

> In any case it's weird that the URL lexeme doesn't span the same
> text as the url_path one, but I'm not sure which one we should
> consider wrong.

Despite that prohibition, I notice that Firefox and wget both seem
to *try* to use such characters when a URL includes them.

-Kevin
