Tom Lane wrote:
>After sleeping on it, I do think that tying the mechanism to newlines
>is just unnecessary complication. I'm currently leaning to an idea that
>was suggested yesterday by (I think) Andreas: let the quote start marker
>be a token of the form
> dollarsign zero-or-more-letters dollarsign
>and let the quote body extend to the next occurrence of the identical
>string. For example
> ... $Q$Joe's house$Q$ ...
>is equivalent to
> ... 'Joe''s house' ...
>
>This is extremely compact for quoting strings that don't contain any
>doubled dollar signs, since you don't need any letters at all. I could
>see $$text$$ becoming a very common way to quote material that contains
>single quotes or backslashes. But since you can choose any string of
>letters to make up the terminating token, the mechanism is able to quote
>any text whatever, including nested occurrences of the same structure
>(with a different letterstring of course).
>
>Note that there is no particular need to insist on any nearby newlines.
>If the construct is written just following an identifier or keyword,
>then you do need some intervening whitespace to keep the $Q$ from being
>read as part of that identifier, but I doubt this will bother anyone.
>
>Note that I'm allowing only letters, not digits, in the string; this
>avoids any possible ambiguity with $n parameter tokens. We have no
>other SQL tokens that are allowed to start with $, so this creates no
>other lexical ambiguity.
>
>Comments?
>
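To check my reading of the rule quoted above, here is a rough Python sketch of how a scanner might recognise such a token. It is only an illustration under assumptions: "letters" is taken to mean ASCII letters, and the function name scan_dollar_quote is made up for this example. It is not a description of any actual lexer.

```python
import re

# Opening marker per the quoted proposal: '$', zero or more letters, '$'.
# Assumption: "letters" means ASCII letters only.
_START = re.compile(r"\$[A-Za-z]*\$")

def scan_dollar_quote(sql, pos=0):
    """Return (body, end_pos) for a dollar quote starting at pos, else None.

    scan_dollar_quote is a hypothetical helper, not a real API.
    """
    m = _START.match(sql, pos)
    if m is None:
        # Not a quote start, e.g. a $1 parameter token.
        return None
    delim = m.group(0)
    # The body extends to the next occurrence of the identical string.
    close = sql.find(delim, m.end())
    if close == -1:
        raise ValueError("unterminated dollar quote " + delim)
    # The delimiter is dropped; only the body is returned.
    return sql[m.end():close], close + len(delim)
```

With this sketch, scanning $Q$Joe's house$Q$ yields the body Joe's house, and a quote using a different letter string can be nested inside the body untouched.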
I like it. It is really quite similar to Perl's q$text$ mechanism, but
makes allowances for the fact that we are in a multi-language
environment. I presume the delimiter will never be kept, but will be
eaten by the lexer. I'd like to see pg_dump use this mechanism for
quoting, at least for function bodies. I guess it could retrieve the
text and then keep generating delimiters until it found one that didn't
occur inside the text. Maybe for that purpose we could allow underscores
as well as letters - I don't think that should introduce any extra
ambiguities. Alternatively, or as well, maybe leading and trailing
digits could be disallowed, but embedded digits could be allowed. IOW,
let's be as liberal as possible without breaking things.
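
To make the pg_dump idea concrete, here is a minimal Python sketch of "keep generating delimiters until one doesn't occur in the text". All the names (candidate_tags, choose_delimiter, dollar_quote, is_valid_tag) are hypothetical, the candidate order is an arbitrary choice, and the tag pattern encodes only the more liberal rule floated above (letters and underscores, digits allowed when embedded). It is a sketch under those assumptions, not what pg_dump does.

```python
import itertools
import re
import string

# Assumed liberal tag rule: letters/underscores, digits only when embedded.
_TAG = re.compile(r"(?:[A-Za-z_](?:[A-Za-z0-9_]*[A-Za-z_])?)?")

def is_valid_tag(tag):
    """True if tag is empty or fits the assumed liberal rule."""
    return _TAG.fullmatch(tag) is not None

def candidate_tags():
    """Yield '', 'a', ..., 'z', 'aa', 'ab', ... (an arbitrary ordering)."""
    yield ""
    for n in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase, repeat=n):
            yield "".join(letters)

def choose_delimiter(text):
    """Return the first candidate delimiter that cannot end the quote early."""
    for tag in candidate_tags():
        delim = "$" + tag + "$"
        # The first occurrence of delim in text + delim must be the closing
        # one; this also rejects "$$" for text that ends in "$".
        if (text + delim).find(delim) == len(text):
            return delim

def dollar_quote(text):
    """Wrap text in a delimiter chosen so the body needs no escaping."""
    delim = choose_delimiter(text)
    return delim + text + delim
```

For a body with no doubled dollar signs this returns the bare $$ form, and it only reaches for letters when the text forces it to.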
cheers
andrew