Re: Replacing plpgsql's lexer - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Replacing plpgsql's lexer
Date
Msg-id 4136ffa0904150350j222f1e6brfdca27652d3986fd@mail.gmail.com
In response to Re: Replacing plpgsql's lexer  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On Wed, Apr 15, 2009 at 11:33 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
>> This is a fundamental conflict, not one that has a single simple answer.
>>
>> However this seems like a strange place to pick your battle.
>
> I think you are right that you perceive a fundamental conflict and most
> things I say become battles. That is not my choice and I will withdraw
> from further discussion. My point has been made clearly and has not been
> made to cause conflict. I've better things to do with my time than that,
> though it's a shame you think that of me.

Uhm, I didn't intend this as criticism at all, except insofar as I
questioned whether the plpgsql lexer was a good choice of place to
make this stand. The use of "battle" was only because of the idiom
"pick your battle".

I think we are in general too conservative about making changes and
you are concerned that we're not giving enough thought to the upgrade
pain and should be more conservative. We can talk about general
policies but ultimately we'll have to debate each change on its
merits.

In this case it would help if we described the specific kinds of code
affected and the consequences for users. I'm not sure we're all on the
same page.

I think changing the lexer to match the SQL lexer will only affect
string constants and only if standards_conforming_strings is enabled,
and only those instances which are handled internally to plpgsql and
not passed to the SQL engine. So the fix will pretty much always be
local to the behaviour change. It's possible for an escaped string to
need an E'' and for the backslash to migrate to other parts of the
code before triggering a bug (or possibly even get stored in the
database and cause a problem in other parts of the application). But
it should still be pretty straightforward to find the original source
of the string and also pretty easy to recognize string constants
throughout the source code.

As it currently stands a programmer sometimes has to use E'\x' and
sometimes has to use '\x', depending on whether plpgsql is lexing the
string itself or passing it to the SQL engine unlexed. It's not obvious
to a user which parts get handled which way, since some constructs that
don't appear to be SQL are handled as SQL and vice versa -- at least
it's not obvious to me, even having read the source in the past.
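
To make the two spellings concrete, here is a minimal sketch of how the
SQL lexer itself treats them when standards_conforming_strings is on.
It only shows the SQL-engine side; which plpgsql constructs take which
path is exactly the part that isn't obvious, so the sketch doesn't try
to show that.

```sql
-- Assumes a session where we are free to change the setting.
SET standards_conforming_strings = on;

-- A plain literal keeps the backslash as an ordinary character,
-- while an E'' literal processes \n as a newline escape.
SELECT length('a\nb')  AS plain_len,    -- 4: a, backslash, n, b
       length(E'a\nb') AS escaped_len;  -- 3: a, newline, b
```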

If I understand things correctly I think the change improves the
language for future users by far more than it imposes maintenance
costs on existing users, especially considering that anyone depending
on '\x' strings with standards_conforming_strings enabled is probably
already getting it wrong in some places without realizing it anyway.

-- 
greg

