Re: Lexer patch question - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: Lexer patch question
Date
Msg-id 200506151736.j5FHaMM20283@candle.pha.pa.us
In response to Re: Lexer patch question  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I am confused why the following change Tom made to scan.l works.
> > Isn't that 'x' required so xqescape doesn't match '\x'?
>
> > *** scan.l    2 Jun 2005 01:23:08 -0000    1.123
> > --- scan.l    2 Jun 2005 17:45:17 -0000    1.124
> > ***************
> > *** 193,199 ****
> >   xqstart            {quote}
> >   xqdouble        {quote}{quote}
> >   xqinside        [^\\']+
> > ! xqescape        [\\][^0-7x]
> >   xqoctesc        [\\][0-7]{1,3}
> >   xqhexesc        [\\]x[0-9A-Fa-f]{1,2}
>
> > --- 193,199 ----
> >   xqstart            {quote}
> >   xqdouble        {quote}{quote}
> >   xqinside        [^\\']+
> > ! xqescape        [\\][^0-7]
> >   xqoctesc        [\\][0-7]{1,3}
> >   xqhexesc        [\\]x[0-9A-Fa-f]{1,2}
>
> No; if a match to xqhexesc is possible, the lexer will prefer that match
> because it is longer.  If a match to xqhexesc is not possible --- that
> is, we have \x not followed by a hex digit --- then we *want* xqescape
> to match.  The original coding forced a backup to the <xq>. rule in this
> situation, which is not how we want it to behave.

Oh, I didn't realize lexers would choose the longer token when given
multiple options.
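
To illustrate the longest-match rule Tom describes, here is a small Python
sketch. It is only a toy model of the matching behavior, not flex itself.
The regexes are copied from the scan.l diff quoted above; the names RULES,
OLD_RULES, next_token, tokenize, and the "other" fallback that stands in
for the <xq>. rule are invented for this example.

```python
import re

# Patterns from scan.l revision 1.124 (after the patch).
RULES = [
    ("xqinside", re.compile(r"[^\\']+")),
    ("xqescape", re.compile(r"[\\][^0-7]")),
    ("xqoctesc", re.compile(r"[\\][0-7]{1,3}")),
    ("xqhexesc", re.compile(r"[\\]x[0-9A-Fa-f]{1,2}")),
]

# Same list, but with xqescape as it was in revision 1.123.
OLD_RULES = [
    (name, re.compile(r"[\\][^0-7x]") if name == "xqescape" else pattern)
    for name, pattern in RULES
]


def next_token(text, pos, rules):
    """Try every rule at pos and keep the longest match.

    The strict '>' means that on a tie the rule listed first wins,
    which mimics how flex breaks ties between equal-length matches.
    """
    best = None
    for name, pattern in rules:
        m = pattern.match(text, pos)
        if m and (best is None or m.end() > best[1]):
            best = (name, m.end())
    return best


def tokenize(text, rules=RULES):
    """Split text into (rule name, matched text) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        tok = next_token(text, pos, rules)
        if tok is None:
            # Hypothetical stand-in for the catch-all <xq>. rule:
            # consume a single character.
            tokens.append(("other", text[pos]))
            pos += 1
        else:
            name, end = tok
            tokens.append((name, text[pos:end]))
            pos = end
    return tokens


if __name__ == "__main__":
    for sample in (r"\x41", r"\xZ", r"\101"):
        print(sample, tokenize(sample), tokenize(sample, OLD_RULES))
```

With the patched rules, \x41 is still taken by xqhexesc because that match
is longer than the two-character xqescape match, while \xZ now goes to
xqescape. With the old rules, \xZ matches none of the escape patterns and
the backslash falls through to the catch-all.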

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
