Re: BUG #5075: Text Search parser does not identify xml tag when attribute name's contains underscore - Mailing list pgsql-bugs

From Robert Haas
Subject Re: BUG #5075: Text Search parser does not identify xml tag when attribute name's contains underscore
Date
Msg-id 603c8f070909271949r4c3874e9y85f1bad2fdd7eb20@mail.gmail.com
In response to Re: BUG #5075: Text Search parser does not identify xml tag when attribute name's contains underscore  (Euler Taveira de Oliveira <euler@timbira.com>)
Responses Re: BUG #5075: Text Search parser does not identify xml tag when attribute name's contains underscore
List pgsql-bugs
On Wed, Sep 23, 2009 at 7:31 PM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:
> Marek Lewczuk wrote:
>> Please execute the following example:
>> select * from ts_debug('english', '<img width="182" height="120"
>> align="right" style="margin: 0px 0px 5px 5px;" test_aa="26461"/>')
>>
>> In the result you will see that <img/> is not identified as an XML tag but
>> is instead split into words, blank spaces, etc. The reason is that the last
>> attribute, "test_aa", contains an underscore in its name; when the
>> underscore is removed, the img tag is properly identified as an XML tag.
>>
>> The XML definition allows underscores in tag and attribute names.
>>
> The problem is we already allow it in tag names but not in attribute names. So
> the proper fix is to allow underscore when the state is TPS_InTag; according
> to XML spec [1], the underscore is a valid character in attribute names.
>
> A possible downside is that we don't have underscores in HTML attribute names.
> In this case, should it fail? I don't think so but...
>
> The problem exists in 8.3, 8.4 and HEAD. It is a trivial fix, so I see no
> problem with back-patching it.
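
For readers following along, the idea described above (accept an underscore
while scanning an attribute name inside a tag) can be sketched roughly as
below. This is only an illustration, not the actual parser code or the
proposed patch: the function names, the allow_underscore flag, and the exact
set of accepted characters are all assumptions made for the example, and the
real XML rules for names are stricter than this.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not PostgreSQL's parser code): decides whether c may
 * appear in an attribute name while scanning inside a tag. The accepted
 * character set here is an assumption chosen for illustration.
 */
bool
is_attr_name_char(unsigned char c, bool allow_underscore)
{
    if (isalnum(c) || c == '-' || c == ':')
        return true;
    /* The behaviour change under discussion: optionally accept '_'. */
    return allow_underscore && c == '_';
}

/* Returns true if name is non-empty and every character is accepted. */
bool
attr_name_is_accepted(const char *name, bool allow_underscore)
{
    if (*name == '\0')
        return false;
    for (; *name != '\0'; name++)
    {
        if (!is_attr_name_char((unsigned char) *name, allow_underscore))
            return false;
    }
    return true;
}
```

With allow_underscore set to false, "test_aa" is rejected while "width" is
accepted, which mirrors the reported symptom; with it set to true, both are
accepted.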

This patch should probably be added to
https://commitfest.postgresql.org/action/commitfest_view/open so that
we don't lose track of it.

...Robert
