Thread: pgsql: Fix support of digits in email/hostnames.
Fix support of digits in email/hostnames.

When tsearch was implemented I made several mistakes in the hostname/email
definition rules:
1) allowed underscore in hostnames, which is prohibited by the RFC
2) forgot to allow leading digits separated by a hyphen (like 123-x.com)
   in hostnames
3) did not allow underscore/hyphen after leading digits in the localpart
   of an email

Artur's patch resolves the last two issues, but as a side effect it allows
hostnames like 123_x.com together with 123-x.com. The RFC forbids underscores
in hostnames, but pg has allowed them since the initial tsearch version in
core, although only for non-digits. The patch syncs support for digits and
non-digits in both hostname and email.

Forbidding underscores in hostnames may break existing usage of tsearch and,
anyhow, it should be done in a separate patch.

Author: Artur Zakirov
BUG: #13964

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/61d66c44f18c73094a50a2ef97d26cc03e171dc0

Modified Files
--------------
src/backend/tsearch/wparser_def.c     |  3 +++
src/test/regress/expected/tsearch.out | 22 ++++++++++++++--------
src/test/regress/sql/tsearch.sql      |  6 +++---
3 files changed, 20 insertions(+), 11 deletions(-)
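The rules described in the commit message can be approximated with a small Python sketch. This is only an illustration of the intended behavior after the patch, not the state machine in wparser_def.c. The regular expressions and the names `LABEL`, `HOST_RE`, `EMAIL_RE`, `looks_like_host` and `looks_like_email` are hypothetical, and the sketch ignores how the real parser separates hostnames from numbers, versions and other token types.

```python
import re

# Assumption: one "label" is a run of letters/digits, optionally followed by
# more runs joined by a single hyphen or underscore. A label may start with
# a digit, which is the case the patch adds (e.g. "123-x").
# Underscore is accepted because tsearch tolerates it, not because the RFC does.
LABEL = r"[A-Za-z0-9]+(?:[-_][A-Za-z0-9]+)*"

# Hypothetical approximations of "hostname" and "email" tokens.
HOST_RE = re.compile(rf"{LABEL}(?:\.{LABEL})+")
EMAIL_RE = re.compile(rf"{LABEL}(?:\.{LABEL})*@{LABEL}(?:\.{LABEL})+")


def looks_like_host(text: str) -> bool:
    """Return True if the whole string matches the hostname sketch."""
    return HOST_RE.fullmatch(text) is not None


def looks_like_email(text: str) -> bool:
    """Return True if the whole string matches the email sketch."""
    return EMAIL_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["123-x.com", "123_x.com", "x_y.com", "-x.com"]:
        print(candidate, looks_like_host(candidate))
    for candidate in ["123-abc@sigaev.ru", "123_abc@sigaev.ru"]:
        print(candidate, looks_like_email(candidate))
```

In this sketch digits and non-digits are treated identically inside a label, which mirrors the stated goal of the patch: 123-x.com and 123_x.com are both accepted because x-y.com and x_y.com already were.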
On Tue, Mar 29, 2016 at 03:29:20PM +0000, Teodor Sigaev wrote:
> Fix support of digits in email/hostnames.
>
> When tsearch was implemented I made several mistakes in the hostname/email
> definition rules:
> 1) allowed underscore in hostnames, which is prohibited by the RFC
> 2) forgot to allow leading digits separated by a hyphen (like 123-x.com)
>    in hostnames
> 3) did not allow underscore/hyphen after leading digits in the localpart
>    of an email
>
> Artur's patch resolves the last two issues, but as a side effect it allows
> hostnames like 123_x.com together with 123-x.com. The RFC forbids underscores
> in hostnames, but pg has allowed them since the initial tsearch version in
> core, although only for non-digits. The patch syncs support for digits and
> non-digits in both hostname and email.
>
> Forbidding underscores in hostnames may break existing usage of tsearch and,
> anyhow, it should be done in a separate patch.
>
> Author: Artur Zakirov
> BUG: #13964

Doesn't this invalidate tsvector indexes upgraded by pg_upgrade? Should
they be marked as invalid?

Can you also fix the other two TODO items related to this?

	Improve handling of dash and plus signs in email address
	user names, and perhaps improve URL parsing

	http://www.postgresql.org/message-id/201010122203.o9CM3RW09263@momjian.us
	http://www.postgresql.org/message-id/E1Ri8il-0008Ct-9p@wrigleys.postgresql.org

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +
> Doesn't this invalidate tsvector indexes upgraded by pg_upgrade? Should
> they be marked as invalid?

Directly, it affects functional indexes, i.e. indexes over to_tsvector().
But it also affects tsvector columns: such a column should be recreated if
it was generated by the to_tsvector() function.

>
> Can you also fix the other two TODO items related to this?
>
> 	Improve handling of dash and plus signs in email address
> 	user names, and perhaps improve URL parsing
>
> 	http://www.postgresql.org/message-id/201010122203.o9CM3RW09263@momjian.us
> 	http://www.postgresql.org/message-id/E1Ri8il-0008Ct-9p@wrigleys.postgresql.org
>

--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                   WWW: http://www.sigaev.ru/
On Fri, Apr 29, 2016 at 01:20:35PM +0300, Teodor Sigaev wrote:
> >Doesn't this invalidate tsvector indexes upgraded by pg_upgrade? Should
> >they be marked as invalid?
>
> Directly, it affects functional indexes, i.e. indexes over to_tsvector().
> But it also affects tsvector columns: such a column should be recreated if
> it was generated by the to_tsvector() function.

OK, so every tsvector column or expression index needs to be reported by
pg_upgrade? Do we want to fix everything else in this same release?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
On Fri, Apr 29, 2016 at 06:43:04AM -0400, Bruce Momjian wrote:
> On Fri, Apr 29, 2016 at 01:20:35PM +0300, Teodor Sigaev wrote:
> > >Doesn't this invalidate tsvector indexes upgraded by pg_upgrade? Should
> > >they be marked as invalid?
> >
> > Directly, it affects functional indexes, i.e. indexes over to_tsvector().
> > But it also affects tsvector columns: such a column should be recreated if
> > it was generated by the to_tsvector() function.
>
> OK, so every tsvector column or expression index needs to be reported by
> pg_upgrade? Do we want to fix everything else in this same release?

I guess my point is that we should do all pg_upgrade-breaking tsvector
changes in a single release so we don't need to invalidate tsvector
columns and indexes in two releases. If we can't do them all in 9.6,
perhaps we should revert this change and do them all in 9.7.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com