I wrote:
> > I found a possible other way to bring the size of the transition table
> > under 32k entries while keeping the existing no-backup rules in place:
> > Replace the "quotecontinue" rule with a new state. In the attached
> > draft patch, when Flex encounters a quote while inside any kind of
> > quoted string, it saves the current state and enters %xqs (think
> > 'quotestop'). If it then sees {whitespace_with_newline}{quote}, it
> > reenters the previous state and continues to slurp the string,
> > otherwise, it throws back everything and returns the string it just
> > exited. Doing it this way is a bit uglier, but with some extra
> > commentary it might not be too bad.
>
> I had an epiphany and managed to get rid of the backup states.
> Regression tests pass. The array is down to 30367 entries and the
> binary is smaller by 172kB on Linux x86-64. Performance is identical
> to master on both tests mentioned upthread. I'll clean this up and add
> it to the commitfest.
For the commitfest:
0001 is a small patch to remove some unneeded generality from the
current rules. This lowers the number of elements in the yy_transition
array from 37045 to 36201.
0002 is a cleaned-up version of the above, bringing the size down to 29521.
I haven't changed psqlscan.l or pgc.l, in case this approach is
changed or rejected.
With the two together, the binary is about 175kB smaller than on HEAD.
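
To make the xqs idea quoted above concrete, here is a rough standalone C
sketch of the control flow. It is not the Flex rules from the patch: the
function name lex_string, the state enum, and the simplified notion of
whitespace (spaces, tabs, CR, LF, no comments) are assumptions made purely
for illustration.

```c
#include <stddef.h>

/* Hypothetical states: inside a quoted string, and "quote stop". */
enum lex_state { ST_XQ, ST_XQS };

/*
 * Scan one string literal starting at in[0] == '\''.
 * Copies the string body into out (truncating if it does not fit) and
 * returns the number of input bytes consumed, or 0 if the input does not
 * start with a quote or the string is unterminated.
 */
static size_t
lex_string(const char *in, char *out, size_t outsz)
{
    size_t i = 1;                 /* position just past the opening quote */
    size_t n = 0;                 /* bytes written to out */
    enum lex_state state = ST_XQ;
    enum lex_state saved = ST_XQ;

    if (in[0] != '\'' || outsz == 0)
        return 0;

    for (;;)
    {
        if (state == ST_XQ)
        {
            if (in[i] == '\0')
                return 0;         /* unterminated string */
            if (in[i] == '\'')
            {
                saved = state;    /* remember which string state we left */
                state = ST_XQS;
            }
            else if (n + 1 < outsz)
                out[n++] = in[i];
            i++;
        }
        else
        {
            /* Look ahead for whitespace containing a newline, then a quote. */
            size_t j = i;
            int saw_newline = 0;

            while (in[j] == ' ' || in[j] == '\t' || in[j] == '\n' || in[j] == '\r')
            {
                if (in[j] == '\n' || in[j] == '\r')
                    saw_newline = 1;
                j++;
            }
            if (saw_newline && in[j] == '\'')
            {
                i = j + 1;        /* continuation: reenter the saved state */
                state = saved;
            }
            else
            {
                out[n] = '\0';    /* throw back the lookahead, return string */
                return i;
            }
        }
    }
}
```

With this sketch, 'foo' followed by a newline and 'bar' comes back as the
single string foobar, while 'foo' 'bar' on one line stops after the first
literal and leaves the rest of the input untouched.
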
I also couldn't resist playing around with the idea upthread to handle
Unicode escapes in parser.c. That further reduces the number of states
to 21068, which leaves some headroom for future additions without going
back to 32-bit types in the transition array. It mostly works, but it's
quite ugly and breaks the token position handling for Unicode escape
syntax errors, so it's not in a state to share.
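
For a rough sense of that headroom, the helper below subtracts an entry
count from INT16_MAX. Treating INT16_MAX as the exact cutoff at which a
wider type becomes necessary is an assumption on my part (the discussion
above only says "under 32k entries"), and transition_headroom is a
hypothetical name, not anything in the tree.

```c
#include <stdint.h>

/*
 * Hypothetical helper: how many more entries fit before a signed 16-bit
 * value can no longer index the table. Assumes INT16_MAX is the cutoff.
 * A negative result means the count is already over the limit.
 */
static int
transition_headroom(int entries)
{
    return INT16_MAX - entries;
}
```

Under that assumption, 29521 entries leave room for 3246 more and 21068
leave room for 11699, while the original 37045 is over the limit.
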
--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services