Re: Bug with Tsearch and tsvector - Mailing list pgsql-bugs

From Kevin Grittner
Subject Re: Bug with Tsearch and tsvector
Date
Msg-id 4BD6B1740200002500030E64@gw.wicourts.gov
In response to Re: Bug with Tsearch and tsvector  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> We'd probably not want to apply this as-is, but should first
>>> tighten up what characters URLPath allows, per Kevin's spec
>>> research.
>
>> If we're headed that way, I figured I should double-check.  The
>> RFC I referenced was later obsoleted by:
>> http://www.ietf.org/rfc/rfc3986.txt
>
> On reflection, since we're changing the behavior anyway, it seems
> like the most defensible thing to do is make the TS parser follow
> the RFC's allowed character set exactly.  The newer RFC doesn't
> restrict '#' so that possible corner case is gone.

It seems worth mentioning that there is a BSD-licensed URI parser on
SourceForge:

http://uriparser.sourceforge.net/

I'm not advocating for using it; I just ran across it, and it seemed
of possible interest.

-Kevin
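
For a concrete sense of what "follow the RFC's allowed character set"
could mean for a URL path, here is a minimal sketch in Python. It is
not the PostgreSQL TS parser and not the uriparser library; the name
is_rfc3986_path is hypothetical, and the character classes are
transcribed from the "pchar" rule in RFC 3986 section 3.3 (unreserved,
percent-encoded, sub-delims, ":" and "@"), with "/" as the segment
separator.

```python
import re

# One path character per RFC 3986 section 3.3:
#   pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
# unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
# sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
_PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"

# A path is any run of pchars and "/" separators (the empty path is allowed).
_PATH_RE = re.compile(r"(?:/|" + _PCHAR + r")*")


def is_rfc3986_path(candidate: str) -> bool:
    """Return True if every character of candidate is legal in a URI path.

    Hypothetical helper for illustration only; it checks the character
    set and percent-encoding shape, nothing else.
    """
    return _PATH_RE.fullmatch(candidate) is not None


if __name__ == "__main__":
    for sample in ["/docs/a%20b", "/a b", "/a%2", "/a<b>"]:
        print(repr(sample), is_rfc3986_path(sample))
```

A tokenizer built on this idea would stop the URL path token at the
first character that fails the check, rather than rejecting the whole
string; the sketch only shows the character-set test itself.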
