Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores
Date
Msg-id 201003130048.o2D0mOP16522@momjian.us
In response to Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores  (Teodor Sigaev <teodor@sigaev.ru>)
Responses Re: Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores
List pgsql-hackers
Teodor Sigaev wrote:
> > Oleg, Teodor, can you look at this?  I tried to fix it in wparser_def.c,
> > but couldn't figure out how.  Thanks.
> >>
> >> select distinct token as email
> >> from ts_parse('default', ' first_last@yahoo.com '   )
> >> where tokid = 4
> 
> Patch in attachment, it allows underscore in the middle of local part of email
> and in host name (similarly to '-' character).

Thanks, patch applied.

> I'm not sure about backpatching, because it could break existing search 
> configuration.

Agreed.  I don't think this warrants backpatching.

Here is the before behavior:
        test=> select ts_parse('default', ' first_last@yahoo.com '   );
              ts_parse
        --------------------
         (12," ")
         (1,first)
         (12,_)
-->      (4,last@yahoo.com)
         (12," ")
        (5 rows)

and the after-patch, fixed behavior:
        test=> select ts_parse('default', ' first_last@yahoo.com '   );
                 ts_parse
        --------------------------
         (12," ")
-->      (4,first_last@yahoo.com)
         (12," ")
        (3 rows)

I assume that, because this only expands the pattern space for email
addresses, there is no effect on binary upgrades with this patch.  Is that
correct?  Would an email address check on a binary-upgraded tsvector
index not match an email address with underscores?  Do we need a warning
in the release notes about this?
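
To make the question concrete, here is a minimal sketch of the mismatch being asked about. It is an illustration under stated assumptions, not output from a real upgraded cluster: the first tsvector literal is hand-written to mimic the lexemes the pre-patch parser emitted for the address (first, last@yahoo.com), and the quoted tsquery literal stands in for the single email lexeme the patched parser would produce for the same input.

```sql
-- Hypothetical stored value: lexemes as the pre-patch parser split the address.
SELECT $$'first':1 'last@yahoo.com':2$$::tsvector
       @@ $$'first_last@yahoo.com'$$::tsquery AS old_vector_matches;

-- Same query against a vector holding the single lexeme the patched parser yields.
SELECT $$'first_last@yahoo.com':1$$::tsvector
       @@ $$'first_last@yahoo.com'$$::tsquery AS new_vector_matches;
```

If the first query comes back false, as one would expect when no lexeme in the old vector equals the whole address, then tsvector values built before the upgrade would have to be recomputed before a search for the full address could find them.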

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do

