Re: massive quotes? - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: massive quotes?
Msg-id 3F61D92C.5000300@dunslane.net
In response to Re: massive quotes?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: massive quotes?
List pgsql-hackers
Tom Lane wrote:

>After sleeping on it, I do think that tying the mechanism to newlines
>is just unnecessary complication.  I'm currently leaning to an idea that
>was suggested yesterday by (I think) Andreas: let the quote start marker
>be a token of the form
>    dollarsign zero-or-more-letters dollarsign
>and let the quote body extend to the next occurrence of the identical
>string.  For example
>    ... $Q$Joe's house$Q$ ...
>is equivalent to
>    ... 'Joe''s house' ...
>
>This is extremely compact for quoting strings that don't contain any
>doubled dollar signs, since you don't need any letters at all.  I could
>see $$text$$ becoming a very common way to quote material that contains
>single quotes or backslashes.  But since you can choose any string of
>letters to make up the terminating token, the mechanism is able to quote
>any text whatever, including nested occurrences of the same structure
>(with a different letterstring of course).
>
>Note that there is no particular need to insist on any nearby newlines.
>If the construct is written just following an identifier or keyword,
>then you do need some intervening whitespace to keep the $Q$ from being
>read as part of that identifier, but I doubt this will bother anyone.
>
>Note that I'm allowing only letters, not digits, in the string; this
>avoids any possible ambiguity with $n parameter tokens.  We have no
>other SQL tokens that are allowed to start with $, so this creates no
>other lexical ambiguity.
>
>Comments?

I like it. It is really quite similar to Perl's q$text$ mechanism, while
making allowances for the fact that we are in a multi-language environment.
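
To make the quoted rule concrete, here is a minimal Python sketch of how a scanner might recognize such a token. It is only an illustration of the rule as described above, not a proposal for the real lexer; the function name read_dollar_quoted is made up for this example.

```python
import re

# Start marker as described above: '$', zero or more letters, '$'.
# Letters only, so "$1"-style parameter tokens never match.
_START = re.compile(r"\$[A-Za-z]*\$")


def read_dollar_quoted(sql, pos=0):
    """Scan a dollar-quoted string beginning at sql[pos].

    Returns (body, end_pos), where end_pos is the index just past the
    closing marker, or None if no start marker begins at pos.
    (Hypothetical helper, for illustration only.)
    """
    m = _START.match(sql, pos)
    if m is None:
        return None
    delim = m.group(0)
    # The body extends to the next occurrence of the identical marker.
    close = sql.find(delim, m.end())
    if close == -1:
        raise ValueError("unterminated dollar-quoted string")
    # The markers themselves are dropped; only the body is returned.
    return sql[m.end():close], close + len(delim)
```

With this sketch, scanning `$Q$Joe's house$Q$` yields the body `Joe's house`, and an inner `$$...$$` inside a `$a$...$a$` string is passed through untouched as part of the body.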

I presume the delimiter will never be kept as part of the string value, but
will be eaten by the lexer. I'd like to see pg_dump use this mechanism for
quoting, at least for function bodies. I guess it could retrieve the text and
then keep generating delimiters until it found one that didn't occur inside
the text. Maybe for that purpose we could allow underscores as well as
letters; I don't think that should introduce any extra ambiguities.
Alternatively, or as well, maybe leading and trailing digits could be
disallowed, but embedded digits could be allowed. IOW let's be as
liberal as possible without breaking things.
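
The "keep generating delimiters" idea could look roughly like the Python sketch below. It is a rough illustration rather than how pg_dump would actually do it: the names pick_delimiter and dollar_quote are invented here, and it sticks to the letters-only rule (no underscores or digits). It also assumes the lexer ends the string at the first occurrence of the marker, so it rejects a marker that would match early when the text itself ends in "$".

```python
from itertools import count, product
import string


def pick_delimiter(body):
    """Return the shortest letters-only '$tag$' marker that is safe for body.

    Candidates are tried in order: '$$', '$a$', ..., '$z$', '$aa$', ...
    (Hypothetical helper, for illustration only.)
    """
    for length in count(0):
        for letters in product(string.ascii_lowercase, repeat=length):
            delim = "$" + "".join(letters) + "$"
            # Safe only if the first match in body + delim is the closing
            # marker itself. This also catches a body ending in '$', where
            # the trailing '$' plus the marker would otherwise match early.
            if (body + delim).find(delim) == len(body):
                return delim


def dollar_quote(body):
    """Wrap body in a dollar-quote marker that cannot end the string early."""
    delim = pick_delimiter(body)
    return delim + body + delim
```

For ordinary function bodies this returns plain `$$`, and it only falls back to `$a$`, `$b$` and so on when the text already contains the shorter marker.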

cheers

andrew


