Thread: massive quotes?
What is the state of things regarding having to use massive strings of quotes like this (taken from the 7.3 docs)?: a_output := a_output || '' if v_'' || referrer_keys.kind || '' like '''''''''' || referrer_keys.key_string || '''''''''' then return '''''' || referrer_keys.referrer_type || ''''''; end if;''; This is truly ugly, IMNSHO. Perl has its q() construct - any chance of us doing something here? I'm prepared to put in effort to implement a solution if there is agreement on what the solution should be. Maybe some sortof magical operator/function? cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > What is the state of things regarding having to use massive strings of > quotes like this (taken from the 7.3 docs)?: How much of that is already replaceable by the tactic of using quote_literal() wherever possible? It's safest to be using quote_literal anyway wherever you're interpolating a string value into a query. regards, tom lane
Tom Lane wrote: >Andrew Dunstan <andrew@dunslane.net> writes: > > >>What is the state of things regarding having to use massive strings of >>quotes like this (taken from the 7.3 docs)?: >> >> > >How much of that is already replaceable by the tactic of using >quote_literal() wherever possible? It's safest to be using >quote_literal anyway wherever you're interpolating a string value >into a query. > > > I suspect to get something nice it would have to operate at the scanner/parser layer. cheers andrew
On Thu, 28 Aug 2003, Andrew Dunstan wrote: > What is the state of things regarding having to use massive strings of > quotes like this (taken from the 7.3 docs)?: > > a_output := a_output || '' if v_'' || > referrer_keys.kind || '' like '''''''''' > || referrer_keys.key_string || '''''''''' > then return '''''' || referrer_keys.referrer_type > || ''''''; end if;''; > > This is truly ugly, IMNSHO. Perl has its q() construct - any chance of > us doing something here? > > I'm prepared to put in effort to implement a solution if there is > agreement on what the solution should be. Maybe some sort of magical > operator/function? I was thinking the most natural thing would be to use something similar to COPY's stdin quoting: CREATE FUNCTION bob() RETURNS INTEGER AS stdin LANGUAGE 'plpgsql'; BEGIN... END; \. Another possibility would be shell/Perl-style here documents, like this: CREATE FUNCTION bob() RETURNS INTEGER AS <<EOF BEGIN... END; EOF LANGUAGE 'plpgsql'; The former seems preferable since it uses a syntax PostgreSQL already supports in another context. Jon
Jon Jensen wrote: >On Thu, 28 Aug 2003, Andrew Dunstan wrote: > > > >>What is the state of things regarding having to use massive strings of >>quotes like this (taken from the 7.3 docs)?: >> >>a_output := a_output || '' if v_'' || >> referrer_keys.kind || '' like '''''''''' >> || referrer_keys.key_string || '''''''''' >> then return '''''' || referrer_keys.referrer_type >> || ''''''; end if;''; >> >>This is truly ugly, IMNSHO. Perl has its q() construct - any chance of >>us doing something here? >> >>I'm prepared to put in effort to implement a solution if there is >>agreement on what the solution should be. Maybe some sort of magical >>operator/function? >> >> > >I was thinking the most natural thing would be to use something similar to >COPY's stdin quoting: > >CREATE FUNCTION bob() RETURNS INTEGER AS stdin LANGUAGE 'plpgsql'; >BEGIN > ... >END; >\. > >Another possibility would be shell/Perl-style here documents, like this: > >CREATE FUNCTION bob() RETURNS INTEGER AS <<EOF >BEGIN > ... >END; >EOF >LANGUAGE 'plpgsql'; > >The former seems preferable since it uses a syntax PostgreSQL already >supports in another context. > >Jon > > Nice idea. I would probably never have thought of it :-) Makes function text almost first class, in the way that Oracle's is AFAICS, I also prefer the first version, not least because it hoists the LANGUAGE clause to the top where it seems to me it belongs. I have no idea how hard this would be. cheers andrew
Is there a TODO here? The only problem I see is that it introduces the idea of special column-1 handling, which we don't have right now, except in COPY. --------------------------------------------------------------------------- Andrew Dunstan wrote: > Jon Jensen wrote: > > >On Thu, 28 Aug 2003, Andrew Dunstan wrote: > > > > > > > >>What is the state of things regarding having to use massive strings of > >>quotes like this (taken from the 7.3 docs)?: > >> > >>a_output := a_output || '' if v_'' || > >> referrer_keys.kind || '' like '''''''''' > >> || referrer_keys.key_string || '''''''''' > >> then return '''''' || referrer_keys.referrer_type > >> || ''''''; end if;''; > >> > >>This is truly ugly, IMNSHO. Perl has its q() construct - any chance of > >>us doing something here? > >> > >>I'm prepared to put in effort to implement a solution if there is > >>agreement on what the solution should be. Maybe some sort of magical > >>operator/function? > >> > >> > > > >I was thinking the most natural thing would be to use something similar to > >COPY's stdin quoting: > > > >CREATE FUNCTION bob() RETURNS INTEGER AS stdin LANGUAGE 'plpgsql'; > >BEGIN > > ... > >END; > >\. > > > >Another possibility would be shell/Perl-style here documents, like this: > > > >CREATE FUNCTION bob() RETURNS INTEGER AS <<EOF > >BEGIN > > ... > >END; > >EOF > >LANGUAGE 'plpgsql'; > > > >The former seems preferable since it uses a syntax PostgreSQL already > >supports in another context. > > > >Jon > > > > > > Nice idea. I would probably never have thought of it :-) Makes function > text almost first class, in the way that Oracle's is AFAICS, > > I also prefer the first version, not least because it hoists the > LANGUAGE clause to the top where it seems to me it belongs. > > I have no idea how hard this would be. > > > cheers > > andrew > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Is there a TODO here? The only problem I see is that it introduces the > idea of special column-1 handling, which we don't have right now, except > in COPY. I think the idea would be to execute what's effectively a COPY IN during the CREATE FUNCTION command, and then use the data so collected as the function body. It seems doable offhand, though we'd have to think about whether this breaks any aspect of the FE/BE protocol, or whether it's truly a user-friendly interface. regards, tom lane
On Sat, 30 Aug 2003, Bruce Momjian wrote: > > >I was thinking the most natural thing would be to use something similar to > > >COPY's stdin quoting: > > > > > >CREATE FUNCTION bob() RETURNS INTEGER AS stdin LANGUAGE 'plpgsql'; > > >BEGIN > > > ... > > >END; > > >\. > > Is there a TODO here? The only problem I see is that it introduces the > idea of special column-1 handling, which we don't have right now, except > in COPY. I wouldn't mind taking on this job for 7.5. Yes, I think it's worth adding to the TODO list. Jon
There's certainly a TODO, but I'm not yet 100% sure that the solution Jon proposed is best. I'm looking at it as we write, and making it work looks to be somewhat complex. If we go that road we would probably need to generalise somewhat the COPY code. The alternative in my mind is some function foo such that foo(e) = 'e' Comments are welcome - this is virgin ground for me :-) andrew Bruce Momjian wrote: >Is there a TODO here? The only problem I see is that it introduces the >idea of special column-1 handling, which we don't have right now, except >in COPY. > >--------------------------------------------------------------------------- > >Andrew Dunstan wrote: > > >>Jon Jensen wrote: >> >> >> >>>On Thu, 28 Aug 2003, Andrew Dunstan wrote: >>> >>> >>> >>> >>> >>>>What is the state of things regarding having to use massive strings of >>>>quotes like this (taken from the 7.3 docs)?: >>>> >>>>a_output := a_output || '' if v_'' || >>>> referrer_keys.kind || '' like '''''''''' >>>> || referrer_keys.key_string || '''''''''' >>>> then return '''''' || referrer_keys.referrer_type >>>> || ''''''; end if;''; >>>> >>>>This is truly ugly, IMNSHO. Perl has its q() construct - any chance of >>>>us doing something here? >>>> >>>>I'm prepared to put in effort to implement a solution if there is >>>>agreement on what the solution should be. Maybe some sort of magical >>>>operator/function? >>>> >>>> >>>> >>>> >>>I was thinking the most natural thing would be to use something similar to >>>COPY's stdin quoting: >>> >>>CREATE FUNCTION bob() RETURNS INTEGER AS stdin LANGUAGE 'plpgsql'; >>>BEGIN >>> ... >>>END; >>>\. >>> >>>Another possibility would be shell/Perl-style here documents, like this: >>> >>>CREATE FUNCTION bob() RETURNS INTEGER AS <<EOF >>>BEGIN >>> ... >>>END; >>>EOF >>>LANGUAGE 'plpgsql'; >>> >>>The former seems preferable since it uses a syntax PostgreSQL already >>>supports in another context. >>> >>>Jon >>> >>> >>> >>> >>Nice idea. I would probably never have thought of it :-) Makes function >>text almost first class, in the way that Oracle's is AFAICS, >> >>I also prefer the first version, not least because it hoists the >>LANGUAGE clause to the top where it seems to me it belongs. >> >>I have no idea how hard this would be. >> >> >>cheers >> >>andrew >> >> >>
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Is there a TODO here? The only problem I see is that it introduces the > > idea of special column-1 handling, which we don't have right now, except > > in COPY. > > I think the idea would be to execute what's effectively a COPY IN during > the CREATE FUNCTION command, and then use the data so collected as the > function body. It seems doable offhand, though we'd have to think about > whether this breaks any aspect of the FE/BE protocol, or whether it's > truly a user-friendly interface. Wow, imagine libpq creating a function. Doesn't it just pass the entire string to the server? How would COPY IN handle that? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Tom Lane wrote: >> I think the idea would be to execute what's effectively a COPY IN during >> the CREATE FUNCTION command, and then use the data so collected as the >> function body. It seems doable offhand, though we'd have to think about >> whether this breaks any aspect of the FE/BE protocol, or whether it's >> truly a user-friendly interface. > Wow, imagine libpq creating a function. Doesn't it just pass the > entire string to the server? How would COPY IN handle that? I don't think libpq would know the difference. It certainly has no idea that a command that you send to the backend is a COPY --- it only finds that out when the Start Copy In message comes back from the backend. I believe the same is true of psql. So if a CREATE FUNCTION command were to issue Start Copy In, both those layers would happily cooperate in passing through everything you type up to "\.". Then we take that pile of bits and use it as the function body string in the rest of CREATE FUNCTION. (None of this need have anything to do with the existing commands/copy.c backend code, either, though some small part of the code might be worth sharing.) Other client libraries such as jdbc might need more surgery, though, if they have more smarts about COPY than libpq does. I'm also wondering why the backend need have anything at all to do with an improved function-definition mode. If you look in the archives you will see speculation about inventing psql backslash commands that would assist in entering raw function definitions by taking something you type and suitably quotifying it. This seems to me to be a cleaner and more localized approach to getting much the same functionality. regards, tom lane
Tom Lane wrote: >I'm also wondering why the backend need have anything at all to do with >an improved function-definition mode. If you look in the archives you >will see speculation about inventing psql backslash commands that would >assist in entering raw function definitions by taking something you type >and suitably quotifying it. This seems to me to be a cleaner and more >localized approach to getting much the same functionality. > > > This would suit my needs, as I use psql to do my db setup. But it would introduce a possible disconnect between psql and other interfaces, e.g. pgadmin, wouldn't it? Might it not be better to do something that was at least available to all clients, rather than make them all have to do their own quote escaping? Another thought: "create function foo() as stdin ..." reads a bit strangely, to me. Perhaps "create function foo() from stdin ..." or "create function foo() inline ..." might be better. The thing that set me off on this track was that I was trying to analyse a pure sql function's performance, so I was C&P'ing to and from my definition file , and constantly having to escape/unescape quotes, and it got damned annoying. I'm not dogmatic about how it is done, though, merely keen to ensure *something* is done. If we are trying to impress people with useability, this is one area where there's a gain to be made, with luck at not much cost. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> I'm also wondering why the backend need have anything at all to do with >> an improved function-definition mode. If you look in the archives you >> will see speculation about inventing psql backslash commands that would >> assist in entering raw function definitions by taking something you type >> and suitably quotifying it. > This would suit my needs, as I use psql to do my db setup. But it would > introduce a possible disconnect between psql and other interfaces, e.g. > pgadmin, wouldn't it? Might it not be better to do something that was at > least available to all clients, rather than make them all have to do > their own quote escaping? What makes you think that a COPY-based interface would be especially convenient for other frontends? In my mind this is entirely a user-interface matter, and as such is best solved at the user interface. psql has one set of needs, but a GUI app has totally different ones. I believe phpPgAdmin and so forth already have their own solutions to the quoting problem, anyway. regards, tom lane
Tom Lane wrote: >Andrew Dunstan <andrew@dunslane.net> writes: > > >>Tom Lane wrote: >> >> >>>I'm also wondering why the backend need have anything at all to do with >>>an improved function-definition mode. If you look in the archives you >>>will see speculation about inventing psql backslash commands that would >>>assist in entering raw function definitions by taking something you type >>>and suitably quotifying it. >>> >>> > > > >>This would suit my needs, as I use psql to do my db setup. But it would >>introduce a possible disconnect between psql and other interfaces, e.g. >>pgadmin, wouldn't it? Might it not be better to do something that was at >>least available to all clients, rather than make them all have to do >>their own quote escaping? >> >> > >What makes you think that a COPY-based interface would be especially >convenient for other frontends? In my mind this is entirely a >user-interface matter, and as such is best solved at the user interface. >psql has one set of needs, but a GUI app has totally different ones. >I believe phpPgAdmin and so forth already have their own solutions to >the quoting problem, anyway. > I was speculating, that it might. But I'm quite prepared to accept that it wouldn't, and go for a purely psql solution, preferably one that pg_dump can understand and use. In that case, though, the solution will presumably look at least a bit different from those discussed so far in this thread. Or would you have psql detect that in place of a string there was "stdin" or whatever and then replace it with the inline string before passing it to the backend? cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > In that case, though, the solution will presumably look at least a bit > different from those discussed so far in this thread. Or would you have > psql detect that in place of a string there was "stdin" or whatever and > then replace it with the inline string before passing it to the backend? Please see the archives. I think that what was being discussed was something along the lines of foo=> CREATE FUNCTION myfunc(...) RETURNS ... ASfoo=> \beginliteralfoo'> type my function definition herefoo'> and herefoo'>\endliteralfoo-> LANGUAGE plpgsql; and psql would proceed to quotify whatever you entered between the two backslash commands. (Notice this could be used for any string-literal entry problem, not only CREATE FUNCTION.) I'm fuzzy on the details though; this may not have been the best idea presented. regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > In that case, though, the solution will presumably look at least a bit > > different from those discussed so far in this thread. Or would you have > > psql detect that in place of a string there was "stdin" or whatever and > > then replace it with the inline string before passing it to the backend? > > Please see the archives. I think that what was being discussed was > something along the lines of > > foo=> CREATE FUNCTION myfunc(...) RETURNS ... AS > foo=> \beginliteral > foo'> type my function definition here > foo'> and here > foo'> \endliteral > foo-> LANGUAGE plpgsql; > > and psql would proceed to quotify whatever you entered between > the two backslash commands. (Notice this could be used for any > string-literal entry problem, not only CREATE FUNCTION.) I'm fuzzy on > the details though; this may not have been the best idea presented. Let me jump in --- there is the issue of how to prevent the backend from running the query through the lexer/parser. The cleanest idea presented was: >CREATE FUNCTION bob() RETURNS INTEGER AS stdin LANGUAGE 'plpgsql';>BEGIN> ...>END;>\. The interesting thing I missed at first viewing was that there is a semicolon after the first line. This allows the backend to go into a COPY-like mode where the client can pass lines to the backend bypassing the lexer/parser. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian said: > Let me jump in --- there is the issue of how to prevent the backend > from running the query through the lexer/parser. The cleanest idea > presented was: > > >CREATE FUNCTION bob() RETURNS INTEGER AS stdin LANGUAGE 'plpgsql'; > >BEGIN > > ... > >END; > >\. > > The interesting thing I missed at first viewing was that there is a > semicolon after the first line. This allows the backend to go into a > COPY-like mode where the client can pass lines to the backend bypassing > the lexer/parser. > If it is all processed in the front end as Tom is suggesting, the backend wouldn't see anything different from what it does now. That said, I rather like this syntax. cheers andrew
Andrew Dunstan wrote: > Bruce Momjian said: > > Let me jump in --- there is the issue of how to prevent the backend > > from running the query through the lexer/parser. The cleanest idea > > presented was: > > > > >CREATE FUNCTION bob() RETURNS INTEGER AS stdin LANGUAGE 'plpgsql'; > > >BEGIN > > > ... > > >END; > > >\. > > > > The interesting thing I missed at first viewing was that there is a > > semicolon after the first line. This allows the backend to go into a > > COPY-like mode where the client can pass lines to the backend bypassing > > the lexer/parser. > > > > If it is all processed in the front end as Tom is suggesting, the backend > wouldn't see anything different from what it does now. > > That said, I rather like this syntax. I don't think we want to push this into each client unless there is no other solution. Seems the libpq copy API could be utilized to handle this cleanly. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Let me jump in --- there is the issue of how to prevent the backend from > running the query through the lexer/parser. The cleanest idea presented > was: > [snip] That's the same idea that was being discussed in this thread. I was trying to remind people of the existence of prior discussion with different (and IMHO better) ideas. > The interesting thing I missed at first viewing was that there is a > semicolon after the first line. This allows the backend to go into a > COPY-like mode where the client can pass lines to the backend bypassing > the lexer/parser. What I don't like about this scheme is that it requires mods on both the backend and client sides, to solve a problem that could be solved as well or better (and definitely more simply) on the client side alone. People are being misled by considering only psql, which is so stupid that it might actually not need any change to be used with a COPY-based function definition mode. I doubt that is true of any other client. For a client that actually understands it's doing a function definition, this is *not* simpler nor better. Instead of running a string-literal-quotify transformation on the function text (which you must admit is trivial), the client has to support switching into the COPY protocol. That's not simpler. Add in any meaningful error recovery (what do you do if the backend doesn't switch into COPY mode?) and it's substantially more complex. regards, tom lane
These approaches are to some extent orthogonal, ISTM. Maybe the best solution would be to do as Tom suggests and just change psql (and, I would add, pg_dump) to act nicely here, and if other interface users/authors need/want something along these lines re-examine the issue. I use JDBC for my web clients in my current project, but I wouldn't create a function in the client. I probably wouldn't using DBI/DBD:Pg either. So taking myself as a sample of one perhaps it is true that it's only needed in psql :-) I'm all for minimal solutions as long as they look nice. cheers andrew Tom Lane wrote: >Bruce Momjian <pgman@candle.pha.pa.us> writes: > > >>Let me jump in --- there is the issue of how to prevent the backend from >>running the query through the lexer/parser. The cleanest idea presented >>was: >>[snip] >> >> > >That's the same idea that was being discussed in this thread. I was >trying to remind people of the existence of prior discussion with >different (and IMHO better) ideas. > > > >>The interesting thing I missed at first viewing was that there is a >>semicolon after the first line. This allows the backend to go into a >>COPY-like mode where the client can pass lines to the backend bypassing >>the lexer/parser. >> >> > >What I don't like about this scheme is that it requires mods on both the >backend and client sides, to solve a problem that could be solved as >well or better (and definitely more simply) on the client side alone. >People are being misled by considering only psql, which is so stupid >that it might actually not need any change to be used with a COPY-based >function definition mode. I doubt that is true of any other client. >For a client that actually understands it's doing a function definition, >this is *not* simpler nor better. Instead of running a >string-literal-quotify transformation on the function text (which you >must admit is trivial), the client has to support switching into the >COPY protocol. That's not simpler. Add in any meaningful error >recovery (what do you do if the backend doesn't switch into COPY mode?) >and it's substantially more complex. > > >
Tom Lane wrote: >What I don't like about this scheme is that it requires mods on both the >backend and client sides, to solve a problem that could be solved as >well or better (and definitely more simply) on the client side alone. >People are being misled by considering only psql, which is so stupid >that it might actually not need any change to be used with a COPY-based >function definition mode. I doubt that is true of any other client. >For a client that actually understands it's doing a function definition, >this is *not* simpler nor better. Instead of running a >string-literal-quotify transformation on the function text (which you >must admit is trivial), the client has to support switching into the >COPY protocol. That's not simpler. Add in any meaningful error >recovery (what do you do if the backend doesn't switch into COPY mode?) >and it's substantially more complex. > This is perfectly true for pgAdmin2/3 and probably most other tools. If using the create dialogs from pgAdmin3, you simply don't bother about that extra quotes because they are added as appropriate. Reducing quotes would be helpful for scripts containing function definitions, but it's obviously not a good idea to have a client executing such a script let it interpret to determine which part goes into the query execute interface and which part needs to use another. Why not regard anything between "AS" and LANGUAGE as source? This would prevent the usage of LANGUAGE as a keyword inside the function definition, but I can't think of any senseful other use anyway. Regards, Andreas
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Let me jump in --- there is the issue of how to prevent the backend from > > running the query through the lexer/parser. The cleanest idea presented > > was: > > [snip] > > That's the same idea that was being discussed in this thread. I was > trying to remind people of the existence of prior discussion with > different (and IMHO better) ideas. > > > The interesting thing I missed at first viewing was that there is a > > semicolon after the first line. This allows the backend to go into a > > COPY-like mode where the client can pass lines to the backend bypassing > > the lexer/parser. > > What I don't like about this scheme is that it requires mods on both the > backend and client sides, to solve a problem that could be solved as > well or better (and definitely more simply) on the client side alone. > People are being misled by considering only psql, which is so stupid > that it might actually not need any change to be used with a COPY-based > function definition mode. I doubt that is true of any other client. > For a client that actually understands it's doing a function definition, > this is *not* simpler nor better. Instead of running a > string-literal-quotify transformation on the function text (which you > must admit is trivial), the client has to support switching into the > COPY protocol. That's not simpler. Add in any meaningful error > recovery (what do you do if the backend doesn't switch into COPY mode?) > and it's substantially more complex. I was assuming when someone executes a COPY FROM STDIN in psql, that the backend returns something saying, "I am in COPY FROM STDIN mode, pass all text to me until I say stop". If that's how it works, wouldn't psql work with no changes. I am concerned about psql having to identify when you are creating a function using the STDIN format. How would it know? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: >> What I don't like about this scheme is that it requires mods on both the >> backend and client sides, to solve a problem that could be solved as >> well or better (and definitely more simply) on the client side alone. > backend returns something saying, "I am in COPY FROM STDIN mode, pass > all text to me until I say stop". If that's how it works, wouldn't psql > work with no changes. psql may indeed work with no changes, because it's too stupid to know when you're defining a function, and so it would not be surprised to get a Start Copy In back from a CREATE FUNCTION command. This would not be true of nearly any other client that one might use to define a function. pgAdmin and other GUI tools all have code that knows perfectly well when they're defining a function, and it would take nontrivial surgery to make them able to use a COPY-based function definition mode. People seem to think that psql is a representative client. It isn't. regards, tom lane
Andreas Pflug <pgadmin@pse-consulting.de> writes: > Why not regard anything between "AS" and LANGUAGE as source? That seems quite impractical, considering that the function body may be in a language that has little in common with SQL even at the lexical level. The SQL parser couldn't even tell what was a comment or quoted string with any reliability. > This would > prevent the usage of LANGUAGE as a keyword inside the function > definition, but I can't think of any senseful other use anyway. Would you be unhappy if "language" appearing in what you think is a comment in the function body caused a failure? How about "language" in what you think is a quoted literal string? Another problem is that the CREATE FUNCTION syntax doesn't insist on any particular ordering of the attributes; there's nothing saying that LANGUAGE must be the first thing after the AS clause. regards, tom lane
Tom Lane wrote: >Andreas Pflug <pgadmin@pse-consulting.de> writes: > > >>Why not regard anything between "AS" and LANGUAGE as source? >> >> > >That seems quite impractical, considering that the function body may be >in a language that has little in common with SQL even at the lexical >level. The SQL parser couldn't even tell what was a comment or quoted >string with any reliability. > > > >>This would >>prevent the usage of LANGUAGE as a keyword inside the function >>definition, but I can't think of any senseful other use anyway. >> >> > >Would you be unhappy if "language" appearing in what you think is a >comment in the function body caused a failure? > The parser would have to know about comments. >How about "language" >in what you think is a quoted literal string? > If it's quoted, it may contain anything without interfering, no? >Another problem is that the CREATE FUNCTION syntax doesn't insist on any >particular ordering of the attributes; there's nothing saying that >LANGUAGE must be the first thing after the AS clause. > This is so at the moment, while all samples use AS .... LANGUAGE. This could stay free format if quoted as usual, and require it strict if the new format is used. Initially, I didn't want to suggest an additional keyword like ENDFUNC or ENDAS, but now I do. If somebody insists on using such a keyword in the function definition body, he could fall back to the quoted format. Regards, Andreas
Andreas Pflug <pgadmin@pse-consulting.de> writes: > Tom Lane wrote: >> Another problem is that the CREATE FUNCTION syntax doesn't insist on any >> particular ordering of the attributes; >> > This is so at the moment, while all samples use AS .... LANGUAGE. This > could stay free format if quoted as usual, and require it strict if the > new format is used. Considering that someone earlier in this thread was specifically asking to put LANGUAGE before the function body (apparently unaware that he already could), I doubt we can get away with removing that flexibility. > Initially, I didn't want to suggest an additional > keyword like ENDFUNC or ENDAS, but now I do. Once you say that, there's not a lot of distance to letting psql do it with \beginlit ... \endlit (or some other yet-to-be-chosen names). regards, tom lane
Tom Lane wrote: >Considering that someone earlier in this thread was specifically asking >to put LANGUAGE before the function body (apparently unaware that he >already could), I doubt we can get away with removing that flexibility. > > I didn't consider *removing* flexibility, but *adding* flexibility by allowing the function body after AS (which already is required to be right in front of the func def) not to be embraced by quotes. >>Initially, I didn't want to suggest an additional >>keyword like ENDFUNC or ENDAS, but now I do. >> >> >Once you say that, there's not a lot of distance to letting psql do it >with \beginlit ... \endlit (or some other yet-to-be-chosen names). > But we're not talking about psql only (and \beginlit and \endlit are quite ugly sql keywords), and this should be a backend thing only (I hope you won't suggest that pgAdmin3 should be a frontend for psql :-) I don't think that if both current and yet-to-be-decided syntax is allowed, a keyword embraced function definition would do any harm. In the (unlikely) case somebody uses that keyword inside the function, the parser will probably notice this instead of silently create a broken function. Regards, Andreas
Tom Lane wrote: >Considering that someone earlier in this thread was specifically asking >to put LANGUAGE before the function body (apparently unaware that he >already could), I doubt we can get away with removing that flexibility. > > > That was me, and I'm glad to see you can do it. I guess I should read docs more carefully. The reason is this: if I'm reading a long file I want a jogger to my brain at the top of the function saying 'OK now switch into language x' rather than having to scan to the bottom (which might be out of sight, even) to find out what I'm reading. I certainly don't want to lose this. >>Initially, I didn't want to suggest an additional >>keyword like ENDFUNC or ENDAS, but now I do. >> >> > >Once you say that, there's not a lot of distance to letting psql do it >with \beginlit ... \endlit (or some other yet-to-be-chosen names). > > It's not very pretty, but I at least could live with it. cheers andrew
Tom Lane wrote: >psql may indeed work with no changes, because it's too stupid to know >when you're defining a function, and so it would not be surprised to get >a Start Copy In back from a CREATE FUNCTION command. This would not be >true of nearly any other client that one might use to define a function. >pgAdmin and other GUI tools all have code that knows perfectly well when >they're defining a function, and it would take nontrivial surgery to >make them able to use a COPY-based function definition mode. > Frontends would not be forced to use something that triggered this - after all we have to stay backwards compatible and support the existing inline string, no? cheers andrew
Andreas Pflug <pgadmin@pse-consulting.de> writes: > But we're not talking about psql only (and \beginlit and \endlit are > quite ugly sql keywords), and this should be a backend thing only I disagree with both of those assertions. psql is the issue because other clients used for administrative work (pgAdmin etc) already have their own solutions to function body editing. A COPY-based design isn't going to make life better for them. Since psql is the issue, a backend hack isn't the answer. I'll agree that I've not seen a really pleasing idea for the psql backslash command names, but I'm open to suggestions... and surely an invented SQL keyword is not any prettier than an invented psql backslash command. > (I hope you won't suggest that pgAdmin3 should be a frontend for psql :-) No, I didn't suggest that. regards, tom lane
Andrew Dunstan <andrew@dunslane.net> writes: > Frontends would not be forced to use something that triggered this - > after all we have to stay backwards compatible and support the existing > inline string, no? Certainly. I didn't see anyone proposing to break backwards compatibility. regards, tom lane
I assume we never came to a final conclusion on how to do CREATE FUNCTION without double-quoting. --------------------------------------------------------------------------- Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > In that case, though, the solution will presumably look at least a bit > > different from those discussed so far in this thread. Or would you have > > psql detect that in place of a string there was "stdin" or whatever and > > then replace it with the inline string before passing it to the backend? > > Please see the archives. I think that what was being discussed was > something along the lines of > > foo=> CREATE FUNCTION myfunc(...) RETURNS ... AS > foo=> \beginliteral > foo'> type my function definition here > foo'> and here > foo'> \endliteral > foo-> LANGUAGE plpgsql; > > and psql would proceed to quotify whatever you entered between > the two backslash commands. (Notice this could be used for any > string-literal entry problem, not only CREATE FUNCTION.) I'm fuzzy on > the details though; this may not have been the best idea presented. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
I think if it could be done in a reasonably aesthetic way in psql that would satisfy many people, without any need to disturb the backend, which Tom objects to. That's a big "if", IMNSHO :-). I'd hate to see this dropped, though cheers andrew Bruce Momjian wrote: >I assume we never came to a final conclusion on how to do CREATE >FUNCTION without double-quoting. > >--------------------------------------------------------------------------- > >Tom Lane wrote: > > >>Andrew Dunstan <andrew@dunslane.net> writes: >> >> >>>In that case, though, the solution will presumably look at least a bit >>>different from those discussed so far in this thread. Or would you have >>>psql detect that in place of a string there was "stdin" or whatever and >>>then replace it with the inline string before passing it to the backend? >>> >>> >>Please see the archives. I think that what was being discussed was >>something along the lines of >> >> foo=> CREATE FUNCTION myfunc(...) RETURNS ... AS >> foo=> \beginliteral >> foo'> type my function definition here >> foo'> and here >> foo'> \endliteral >> foo-> LANGUAGE plpgsql; >> >>and psql would proceed to quotify whatever you entered between >>the two backslash commands. (Notice this could be used for any >>string-literal entry problem, not only CREATE FUNCTION.) I'm fuzzy on >>the details though; this may not have been the best idea presented. >> >> regards, tom lane >> >>---------------------------(end of broadcast)--------------------------- >>TIP 8: explain analyze is your friend >> >> >> > > >
Bruce Momjian wrote: >I assume we never came to a final conclusion on how to do CREATE >FUNCTION without double-quoting. > >--------------------------------------------------------------------------- > Many discussions, but no final conclusion in sight, it seems. That \beginliteral stuff is psql centric, where a sql syntax solution is needed. Regards, Andreas
On Wed, Sep 10, 2003 at 07:04:13PM +0200, Andreas Pflug wrote: > Bruce Momjian wrote: > > >I assume we never came to a final conclusion on how to do CREATE > >FUNCTION without double-quoting. > Many discussions, but no final conclusion in sight, it seems. That > \beginliteral stuff is psql centric, where a sql syntax solution is needed. Oh, is it? Didn't people agree that other frontends (pgAdmin, phpPgAdmin, etc) have solutions for the problem already? -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "In a specialized industrial society, it would be a disaster to have kids running around loose." (Paul Graham)
Alvaro Herrera wrote: >On Wed, Sep 10, 2003 at 07:04:13PM +0200, Andreas Pflug wrote: > > >>Bruce Momjian wrote: >> >> >> >>>I assume we never came to a final conclusion on how to do CREATE >>>FUNCTION without double-quoting. >>> >>> > > > >>Many discussions, but no final conclusion in sight, it seems. That >>\beginliteral stuff is psql centric, where a sql syntax solution is needed. >> >> > >Oh, is it? Didn't people agree that other frontends (pgAdmin, >phpPgAdmin, etc) have solutions for the problem already? > > > I think there is agreement that these do. It would still look ugly in a programmatic interface like JDBC. Not that I use JDBC to set up functions, but I can imagine someone wanting to. But personally I could live with a nice enough psql-only fix. cheers andrew
> >On Wed, Sep 10, 2003 at 07:04:13PM +0200, Andreas Pflug wrote: > > > >>>I assume we never came to a final conclusion on how to do CREATE > >>>FUNCTION without double-quoting. > > > >>Many discussions, but no final conclusion in sight, it seems. That > >>\beginliteral stuff is psql centric, where a sql syntax solution is needed. > > > >Oh, is it? Didn't people agree that other frontends (pgAdmin, > >phpPgAdmin, etc) have solutions for the problem already? I would really prefer a general SQL block quoting mechanism. Although I can use Perl to escape all the quotes in a function block, it'd be really nice to be able to do everything in SQL. May I bring up here documents again? They have the advantage over the COPY-like mechanism of being general, e.g.: INSERT INTO sometable (field1, field2) VALUES (1234, <<EOF A really long text block's place in the world EOF ); as well as being very nice in a function definition. What do others think of that? Jon
Andrew Dunstan wrote: > Alvaro Herrera wrote: > >> On Wed, Sep 10, 2003 at 07:04:13PM +0200, Andreas Pflug wrote: >> >> >>> Bruce Momjian wrote: >>> >>> >>> >>>> I assume we never came to a final conclusion on how to do CREATE >>>> FUNCTION without double-quoting. >>>> >>> >> >> >> >>> Many discussions, but no final conclusion in sight, it seems. That >>> \beginliteral stuff is psql centric, where a sql syntax solution is >>> needed. >>> >> >> >> Oh, is it? Didn't people agree that other frontends (pgAdmin, >> phpPgAdmin, etc) have solutions for the problem already? >> >> >> > I think there is agreement that these do. It would still look ugly in > a programmatic interface like JDBC. Not that I use JDBC to set up > functions, but I can imagine someone wanting to. But personally I > could live with a nice enough psql-only fix. I never agreed that a client solution would be satisfying. While frontends might try to hide some uglyness of the syntax to the user for single functions, editing large scripts with many functions is still suffering from "massive quotes". Regards, Andreas
On Wed, Sep 10, 2003 at 10:35:18PM +0200, Andreas Pflug wrote: > I never agreed that a client solution would be satisfying. While > frontends might try to hide some uglyness of the syntax to the user for > single functions, editing large scripts with many functions is still > suffering from "massive quotes". Oh, and you will be feeding those script to the backend through what? -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "Granting software the freedom to evolve guarantees only different results, not better ones." (Zygo Blaxell)
Alvaro Herrera wrote: >On Wed, Sep 10, 2003 at 10:35:18PM +0200, Andreas Pflug wrote: > > > >>I never agreed that a client solution would be satisfying. While >>frontends might try to hide some uglyness of the syntax to the user for >>single functions, editing large scripts with many functions is still >>suffering from "massive quotes". >> >> > >Oh, and you will be feeding those script to the backend through what? > > > Virtually any client. May be psql, or may be pgAdmin2/pgAdmin3 (the latter featuring syntax highlighting), or other tools the let you execute generic queries. While I'm an old "command liner", I rarely use cmd line tools for db administration/querying. Regards, Andreas
On Wed, 10 Sep 2003, Alvaro Herrera wrote: > On Wed, Sep 10, 2003 at 10:35:18PM +0200, Andreas Pflug wrote: > > > I never agreed that a client solution would be satisfying. While > > frontends might try to hide some uglyness of the syntax to the user for > > single functions, editing large scripts with many functions is still > > suffering from "massive quotes". > > Oh, and you will be feeding those script to the backend through what? I don't know what he'd be using, but I use Perl/DBI for things like that. Sure, I could spawn psql instances, but it's a lot less efficient and is quite different from using DBI directly. Jon
Jon Jensen <jon@endpoint.com> writes: > On Wed, 10 Sep 2003, Alvaro Herrera wrote: > > > On Wed, Sep 10, 2003 at 10:35:18PM +0200, Andreas Pflug wrote: > > > > > I never agreed that a client solution would be satisfying. While > > > frontends might try to hide some uglyness of the syntax to the user for > > > single functions, editing large scripts with many functions is still > > > suffering from "massive quotes". > > > > Oh, and you will be feeding those script to the backend through what? > > I don't know what he'd be using, but I use Perl/DBI for things like that. > Sure, I could spawn psql instances, but it's a lot less efficient and is > quite different from using DBI directly. Could the function bodies be shipped over using the new FE protocol as parameters? That would eliminate the quoting and simplify matters for DBI and other drivers as well. Then we just need a generic interface in plsql to handle passing parameters using the new FE protocol. This would help not only when defining function bodies but also when doing inserts/updates of large pieces of text. -- greg
Greg Stark wrote: > >Could the function bodies be shipped over using the new FE protocol as >parameters? That would eliminate the quoting and simplify matters for DBI and >other drivers as well. > > Oh no, not this discussion again. A whole script containing any number of valid statements must be executable without interpreting the script. Regards, Andreas
On Thu, Sep 11, 2003 at 01:05:47AM +0200, Andreas Pflug wrote: > Greg Stark wrote: > > >Could the function bodies be shipped over using the new FE protocol as > >parameters? That would eliminate the quoting and simplify matters for DBI > >and other drivers as well. > Oh no, not this discussion again. :-) > A whole script containing any number of valid statements must be > executable without interpreting the script. Yes, but executable by what? You said you are a command liner. Then you will probably want to "execute" the script using psql. Then the proposed solution is fine, because the \beginliteral and \endliteral will be interpreted correctly. Now, you also said you wanted to use pgAdmin to administer the database. Is it able to execute the complete script, or you have to fiddle around with the mouse? If the latter, then there's no point in trying to "execute" the script; and I suppose pgAdmin is already capable of taking an non-massively-quoted function body and quote it correctly before passing the CREATE FUNCTION to the server. -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "Et put se mouve" (Galileo Galilei)
Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > On Thu, Sep 11, 2003 at 01:05:47AM +0200, Andreas Pflug wrote: > > > A whole script containing any number of valid statements must be > > executable without interpreting the script. How is that relevant? It's still parseable with parameter placeholders in place of literal parameters. > Yes, but executable by what? You said you are a command liner. Then > you will probably want to "execute" the script using psql. Then the > proposed solution is fine, because the \beginliteral and \endliteral > will be interpreted correctly. Presumably \beginliteral \endliteral would be psql's way of specifying parameters to ship over as parameters. > Now, you also said you wanted to use pgAdmin to administer the database. > Is it able to execute the complete script, or you have to fiddle around > with the mouse? If the latter, then there's no point in trying to > "execute" the script; and I suppose pgAdmin is already capable of taking > an non-massively-quoted function body and quote it correctly before > passing the CREATE FUNCTION to the server. It probably is, but that's not what I was thinking of. I was thinking it wouldn't have to poke around inside the string at all, it would pass it as an out-of-band parameter using the new FE protocol. This helps with DBI too, since you can already do that. $dbh->do("CREATE FUNCTION foo as ? LANGUAGE SQL", $func); Is a whole lot cleaner for the front-end to do than trying to quote the parameters and interpolate them into a query. -- greg
Andreas Pflug wrote: > > I never agreed that a client solution would be satisfying. While > frontends might try to hide some uglyness of the syntax to the user > for single functions, editing large scripts with many functions is > still suffering from "massive quotes". > Yes, I agree that a psql-specific solution is not ideal, quite aside from the problem of making it look nice. It would be better than nothing, though. Something that is done at the language level will be portable across frontends, while something psql-specific will not. Also, pg_dump -s will kindly restore all the quotes for you, so if you ever edit its output (as I do sometimes) you'll have to convert stuff all over again, although I guess pg_dump could be taught to re-unescape things. But then that seems almost as much trouble as teaching the backend a bit of new syntax. For those reasons as well as the aesthetic ones I'd prefer a solution at the language level. cheers andrew
Jon Jensen <jon@endpoint.com> writes: > On Wed, 10 Sep 2003, Alvaro Herrera wrote: > > > On Wed, Sep 10, 2003 at 10:35:18PM +0200, Andreas Pflug wrote: > > > > > I never agreed that a client solution would be satisfying. While > > > frontends might try to hide some uglyness of the syntax to the user for > > > single functions, editing large scripts with many functions is still > > > suffering from "massive quotes". > > > > Oh, and you will be feeding those script to the backend through what? > > I don't know what he'd be using, but I use Perl/DBI for things like that. > Sure, I could spawn psql instances, but it's a lot less efficient and is > quite different from using DBI directly. But Perl/DBI does escaping for you, so all you'd have to do is: $sth = $dbh->prepare ("CREATE FUNCTION foo(x text) RETURNS text AS ? LANGUAGE 'plpgsql'"); $sth->execute($function_body); where $function_body is the unescaped form of the function. So there's no need for a COPY-style mechanism, you can use the current CREATE FUNCTION syntax without having to escape everything yourself. The same argument applies to JDBC. -Doug
Andreas Pflug <pgadmin@pse-consulting.de> writes: > Bruce Momjian wrote: >> I assume we never came to a final conclusion on how to do CREATE >> FUNCTION without double-quoting. > Many discussions, but no final conclusion in sight, it seems. That > \beginliteral stuff is psql centric, where a sql syntax solution is needed. Some people think a "sql syntax solution" is needed, and some do not. regards, tom lane
Tom Lane wrote: >Andreas Pflug <pgadmin@pse-consulting.de> writes: > > >>Bruce Momjian wrote: >> >> >>>I assume we never came to a final conclusion on how to do CREATE >>>FUNCTION without double-quoting. >>> >>> > > > >>Many discussions, but no final conclusion in sight, it seems. That >>\beginliteral stuff is psql centric, where a sql syntax solution is needed. >> >> > >Some people think a "sql syntax solution" is needed, and some do not. > > > So does this get resolved by a vote? cheers andrew
>How is that relevant? It's still parseable with parameter placeholders in >place of literal parameters. > *NO PARSING* The script must be stuffable into PQexec in total, backend does the rest. > > >Presumably \beginliteral \endliteral would be psql's way of specifying >parameters to ship over as parameters. > Again: not psql, but sql language itself must provide this. >It probably is, but that's not what I was thinking of. I was thinking it >wouldn't have to poke around inside the string at all, it would pass it as an >out-of-band parameter using the new FE protocol. > No out-of-band, because this would require splitting the script in pieces. > >This helps with DBI too, since you can already do that. > >$dbh->do("CREATE FUNCTION foo as ? LANGUAGE SQL", $func); > There may be a huge number of CREATE FUNCTION within the same script! Regards, Andreas
Andrew Dunstan wrote: > > > Tom Lane wrote: > >> Some people think a "sql syntax solution" is needed, and some do not. >> > So does this get resolved by a vote? > A "non-sql-syntax solution" is useless. Regards, Andreas
On Thursday 11 September 2003 09:33, Andreas Pflug wrote: > *NO PARSING* > The script must be stuffable into PQexec in total, backend does the rest. > > Again: not psql, but sql language itself must provide this. > > No out-of-band, because this would require splitting the script in pieces. What's wrong with re-using the COPY FROM format? CREATE FUNCTION foo(int4) RETURNS int4 BODY FROM stdin LANGUAGE 'plpgsql'; BEGIN RAISE NOTICE 'param %',$1; RETURN $1; END; \. The only real change is that the body of your function comes after the LANGUAGE specification. Obviously, this is a special case where we are only looking for one value, so suppress delimiters on \t and \n. If that's not liked then something like Perl's qq[] quoting system, which we could use anywhere. Since the options boil down to: 1. Add some syntax to the SQL parser 2. Rely on client capabilities 3. Add syntax to "privileged" client - psql I'll volunteer to help out if people decide #1 is the answer. The only places this quoting seems to be a real issue are when defining functions, so the COPY FROM format above seems to be adequate. Is there anywhere else people need to quote large blocks of text that will almost certainly contain other quotes? -- Richard Huxton Archonet Ltd
Richard Huxton wrote: >On Thursday 11 September 2003 09:33, Andreas Pflug wrote: > > >>*NO PARSING* >>The script must be stuffable into PQexec in total, backend does the rest. >> >>Again: not psql, but sql language itself must provide this. >> >>No out-of-band, because this would require splitting the script in pieces. >> >> > >What's wrong with re-using the COPY FROM format? > I must be writing a completely ununderstandable english, sorry for that. *NO PARSING*, I don't know how to express this differently. What I'm saying all the time that a single sql script file containing a huge number of statements including function creation must be executable without parsing it, splitting it into parts and stuffing the data into different functions. When creating a new instance of a database for an app, I don't create each object one by one interactively, but use a script for this. To create the script, I certainly won't use psql and some arbitrary editor, but a frontend allowing for editing and executing that script. Regards, Andreas
On Thu, 10 Sep 2003, Doug McNaught wrote: > But Perl/DBI does escaping for you, so all you'd have to do is: > > $sth = $dbh->prepare > ("CREATE FUNCTION foo(x text) RETURNS text AS ? LANGUAGE 'plpgsql'"); > $sth->execute($function_body); > > where $function_body is the unescaped form of the function. So > there's no need for a COPY-style mechanism, you can use the current > CREATE FUNCTION syntax without having to escape everything yourself. Right; I'm not saying there's no reasonable way to deal with it in DBI right now. I think the biggest benefit to a block-string quoting mechanism is for scripts to feed to psql, but would prefer having a consistent SQL solution that I could use when desired in libpq/DBI for times I build the entire SQL statement in a string and call it with do(). Jon
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> Some people think a "sql syntax solution" is needed, and some do not. >> > So does this get resolved by a vote? A vote is a little premature when we don't have fully-developed alternatives to choose from. The psql "literal" proposal is fairly squishy around the edges (no one's written down an exact spec), while the camp that wants a sql-syntax solution has not gotten further than wishing. The COPY camp is touting their answer as applicable to more than psql, but I haven't seen any explanation of exactly how it would be useful (or better than existing approaches) in non-psql frontends. The only specific use-case that's been presented for it is psql scripts. The discussion so far today seems to be entirely a rehash of arguments already made (and in many cases already rebutted). Rather than wasting list bandwidth with this, I think each camp ought to go off and do their homework. Give us *details* of how your solution would work. regards, tom lane
On Thursday 11 September 2003 10:40, Andreas Pflug wrote: > Richard Huxton wrote: > >On Thursday 11 September 2003 09:33, Andreas Pflug wrote: > >>*NO PARSING* > >>The script must be stuffable into PQexec in total, backend does the rest. > >> > >>Again: not psql, but sql language itself must provide this. > >> > >>No out-of-band, because this would require splitting the script in > >> pieces. > > > >What's wrong with re-using the COPY FROM format? > > I must be writing a completely ununderstandable english, sorry for that. > > *NO PARSING*, I don't know how to express this differently. I think it's my problem - I thought COPY mytable FROM source_file *was* something handled by the SQL parser in the backend, i.e. you could use it via psql / DBI / jdbc / whatever. As it turns out, if the source file is an actual file (/tmp/foo.sql) then you can, but not if it's "stdin". Presumably, there's some magic going on if you use psql/pg_restore (NOTE-should this be mentioned in the docs). > What I'm saying all the time that a single sql script file containing a > huge number of statements including function creation must be executable > without parsing it, splitting it into parts and stuffing the data into > different functions. > > When creating a new instance of a database for an app, I don't create > each object one by one interactively, but use a script for this. To > create the script, I certainly won't use psql and some arbitrary editor, > but a frontend allowing for editing and executing that script. I agree completely with you - a single file containing statements that I can run into PG via any convenient connection, not something that works one way and not another. -- Richard Huxton Archonet Ltd
Jon Jensen <jon@endpoint.com> writes: > On Thu, 10 Sep 2003, Doug McNaught wrote: > > > But Perl/DBI does escaping for you, so all you'd have to do is: Only because the FE protocol is new and the DBD driver hasn't switched to using it. > > $sth = $dbh->prepare > > ("CREATE FUNCTION foo(x text) RETURNS text AS ? LANGUAGE 'plpgsql'"); > > $sth->execute($function_body); > > > > where $function_body is the unescaped form of the function. So > > there's no need for a COPY-style mechanism, you can use the current > > CREATE FUNCTION syntax without having to escape everything yourself. Well that will only work for as long as DBD actually does do the quoting and interpolating. Presumably soon the driver will be converted to the new FE protocol and ship the parameter out-of-band. Will the CREATE FUNCTION handle a string argument shipping separately like this? If so then all that has to happen is psql has to have a syntax that allows the user to specify parameters. Something like CREATE FUNCTION foo(text) RETURNS text as ? LANGUAGE 'plpgsql' $1<<EOF... EOF then plsql would be able to read in the parameters into buffers without having to dig inside looking for quotes. And execute the query passing the parameters. It could even support alternate sources of data for parameters: CREATE FUNCTION foo(text) RETURNS text as ? LANGUAGE 'plpgsql' $1<'foo.func' ; Another nice thing about this is that it would help not just psql, but any front-end using any driver that supports the new FE protocol. It would also help any other query you want to do with big text parameters. For example: INSERT INTO message (header,body) values (?,?) $1<<EOF From: foo@bar.baz EOF $2<<EOF Big long message EOF ; I don't see any advantage to inventing a new quoting syntax for sql. In fact doing it in sql would only increase the amount of parsing psql and other front-ends would have to do and limit future options. They would still have to parse to find the end of the statement which is the same parsing they have to do to pass the arguments as separate parameters. There's a security issue here too. If the data is already available in alternate storage, such as in an external file or in a separate variable then the last thing you want to have to do is interpolate the data into the sql query only to have the backend parse it out all over again. One bug in the interpolation or the parsing and you have a security hole. Consider what happens if you do the above query but somebody passes a text of: " foo EOF ; DELETE FROM message ; " If the front-end is shipping it over to the backend whole the backend will parse it and execute the DELETE statement. If the front-end is shipping it over as parameters and receives this from a file or from a separate variable, then it will be inserted as text into the table. -- greg
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > Tom Lane wrote: > >> Some people think a "sql syntax solution" is needed, and some do not. > >> > > So does this get resolved by a vote? > > A vote is a little premature when we don't have fully-developed > alternatives to choose from. The psql "literal" proposal is fairly > squishy around the edges (no one's written down an exact spec), while > the camp that wants a sql-syntax solution has not gotten further than > wishing. The COPY camp is touting their answer as applicable to more > than psql, but I haven't seen any explanation of exactly how it would be > useful (or better than existing approaches) in non-psql frontends. The > only specific use-case that's been presented for it is psql scripts. > > The discussion so far today seems to be entirely a rehash of arguments > already made (and in many cases already rebutted). Rather than wasting > list bandwidth with this, I think each camp ought to go off and do their > homework. Give us *details* of how your solution would work. Another idea would be to enable another set of quoting characters, like: CREATE FUNCTION xx ...<-- x = 'fred'; ...--> and have the lexer understand those new quoting characters. We just use '' too much in function bodies to use that also for quoting the function text. Of course, '<--' would have no special meaning inside a quoted string, so we are only eliminating their use as custom operators, and I think that is reasonable. Having heard all the other proposals, I think this will be the clearest. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: >Tom Lane wrote: > > >>The discussion so far today seems to be entirely a rehash of arguments >>already made (and in many cases already rebutted). Rather than wasting >>list bandwidth with this, I think each camp ought to go off and do their >>homework. Give us *details* of how your solution would work. >> >> > >Another idea would be to enable another set of quoting characters, like: > > CREATE FUNCTION xx ... > <-- > x = 'fred'; > ... > --> > >and have the lexer understand those new quoting characters. We just use >'' too much in function bodies to use that also for quoting the function >text. Of course, '<--' would have no special meaning inside a quoted >string, so we are only eliminating their use as custom operators, and I >think that is reasonable. > >Having heard all the other proposals, I think this will be the clearest. > > > This looks quite similar to my proposal. I called it "function body is enclosed in keywords", while Bruce will enclose it in new quote strings. This is obviously very different for the lexer/parser, while identical for the user. Sounds good to me. The quoting strings seem a bit suspicious to me, I can imagine comments like this "converting abc --> def", which would certainly break the function definition (I scanned a part of my sources, I found two occurrences of "-->", one of them in SQL code...) How about quoting strings that look like keywords, e.g. FUNCTIONBODY and ENDFUNCTIONBODY? CREATE FUNCTION foo() RETURNS int4 AS FUNCTIONBODY DECLARE bar int4; BEGIN RETURN bar; END; ENDFUNCTIONBODY LANGUAGE 'plpgsql'; Regards, Andreas Regards, Andreas
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Another idea would be to enable another set of quoting characters, like: Yeah, I was toying with that also; it would be nearly the same as the psql literal proposal, but pushed into the backend. I am not sure what the quoting symbols should look like though. '<--' will not do, it conflicts with SQL comments. Another issue that no one has really addressed is that all these proposals only solve the basic need to double quotes and backslashes in function bodies. Yes, that would be a huge step forward, but if you look at the situations where the plpgsql quoting recipe recommends six or eight or ten quotes in a row, it's still gonna be messy. Seems like you might want another layer of "special quoting" within plpgsql to aid in constructing dynamic SQL commands. Can that be handled with the same notation, or do we need another one? This line of thought suggests that we might want a convention that supports multiple end-markers, like shell here-documents do, so that you can nest uses of special quoting. Perhaps we could do something like this: a special quoted string is started by a line like$$FOO where FOO represents any string of uppercase ASCII letters (we allow only A-Z here, and nothing else not even whitespace on the line, so as to minimize the potential for conflicts). Then the string extends to the first exactly-matching line. This would allow nesting: $$FOO...$$BAR...$$BAR...$$FOO I'm not by any means wedded to this particular syntax, but it came to mind as unlikely to conflict with any existing or planned SQL notations. regards, tom lane
Andreas Pflug wrote: > Bruce Momjian wrote: > > >Tom Lane wrote: > > > > > >>The discussion so far today seems to be entirely a rehash of arguments > >>already made (and in many cases already rebutted). Rather than wasting > >>list bandwidth with this, I think each camp ought to go off and do their > >>homework. Give us *details* of how your solution would work. > >> > >> > > > >Another idea would be to enable another set of quoting characters, like: > > > > CREATE FUNCTION xx ... > > <-- > > x = 'fred'; > > ... > > --> > > > >and have the lexer understand those new quoting characters. We just use > >'' too much in function bodies to use that also for quoting the function > >text. Of course, '<--' would have no special meaning inside a quoted > >string, so we are only eliminating their use as custom operators, and I > >think that is reasonable. > > > >Having heard all the other proposals, I think this will be the clearest. > > > > > > > This looks quite similar to my proposal. I called it "function body is > enclosed in keywords", while Bruce will enclose it in new quote strings. > This is obviously very different for the lexer/parser, while identical > for the user. Sounds good to me. > The quoting strings seem a bit suspicious to me, I can imagine comments > like this "converting abc --> def", which would certainly break the > function definition (I scanned a part of my sources, I found two > occurrences of "-->", one of them in SQL code...) > > How about quoting strings that look like keywords, e.g. FUNCTIONBODY and > ENDFUNCTIONBODY? > > CREATE FUNCTION foo() RETURNS int4 AS > FUNCTIONBODY > DECLARE bar int4; > BEGIN > RETURN bar; > END; > ENDFUNCTIONBODY > LANGUAGE 'plpgsql'; Uh, the problem with long keywords is that you are then requiring the _parser_ to identify those keywords, and at that point, the entire text between the keywords has been sliced up by the lexer, which will certainly make it a mess. I might be wrong that we can even use more then two characters for the start of quote string because I don't think flex supports more than one character of look-ahead. Again, lookahead is the key because you have to flag that text as a quoted string before it gets processed by the lexer. I was picking <-- --> out of the air. We would have to choose other character sequences, and probably only two-character ones. However, looking at scan.l, it seems you could try putting a literal keyword in there. Of course, it would also match 'identifier' code, but if it is earlier, I think it gets matched first. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Uh, the problem with long keywords is that you are then requiring the > _parser_ to identify those keywords, and at that point, the entire text > between the keywords has been sliced up by the lexer, which will > certainly make it a mess. I might be wrong that we can even use more > then two characters for the start of quote string You're wrong. We can use anything we like for the start of the quote string; flex is quite capable of recognizing fixed strings, and even variable ones. I'd prefer to avoid expecting it to handle up/downcasing, I think, but there is no technical reason that the delimiter couldn't look like a keyword. My objection to the proposal FUNCTIONBODY is that it makes it look like the feature is only useful in CREATE FUNCTION. In point of fact, the quoting facility could be used to construct any SQL string literal. The comparison points I am thinking about are shell here-documents and Perl quoting conventions, both of which are used for many things. regards, tom lane
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Uh, the problem with long keywords is that you are then requiring the > > _parser_ to identify those keywords, and at that point, the entire text > > between the keywords has been sliced up by the lexer, which will > > certainly make it a mess. I might be wrong that we can even use more > > then two characters for the start of quote string > > You're wrong. We can use anything we like for the start of the quote > string; flex is quite capable of recognizing fixed strings, and even > variable ones. I'd prefer to avoid expecting it to handle > up/downcasing, I think, but there is no technical reason that the > delimiter couldn't look like a keyword. > > My objection to the proposal FUNCTIONBODY is that it makes it look like > the feature is only useful in CREATE FUNCTION. In point of fact, the > quoting facility could be used to construct any SQL string literal. The > comparison points I am thinking about are shell here-documents and > Perl quoting conventions, both of which are used for many things. Sounds good. I just think keywords in general are weird to use for quoting. We use "'" for quoting, so something similar like another operator combination would be nice. I have never been fond of the here-document approach, though I can see the value of doing here-documents in here-documents, though if we make the open and close of the quote string different <--, -->, we can still do that, no? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Tom Lane wrote: >Bruce Momjian <pgman@candle.pha.pa.us> writes: > > >>Uh, the problem with long keywords is that you are then requiring the >>_parser_ to identify those keywords, and at that point, the entire text >>between the keywords has been sliced up by the lexer, which will >>certainly make it a mess. I might be wrong that we can even use more >>then two characters for the start of quote string >> >> > >You're wrong. We can use anything we like for the start of the quote >string; flex is quite capable of recognizing fixed strings, and even >variable ones. I'd prefer to avoid expecting it to handle >up/downcasing, I think, but there is no technical reason that the >delimiter couldn't look like a keyword. > >My objection to the proposal FUNCTIONBODY is that it makes it look like >the feature is only useful in CREATE FUNCTION. In point of fact, the >quoting facility could be used to construct any SQL string literal. The >comparison points I am thinking about are shell here-documents and >Perl quoting conventions, both of which are used for many things. > > Anyway, a viable solution seems to be very near, the general direction is quite clear. Problem with pure literal quote strings is that they couldn't be immediately adjacent to literal strings, so they should start end end with special chars: $QUOTE1$This is my string$ENDQUOTE1$ Regards, Andreas
Andreas Pflug wrote: > >My objection to the proposal FUNCTIONBODY is that it makes it look like > >the feature is only useful in CREATE FUNCTION. In point of fact, the > >quoting facility could be used to construct any SQL string literal. The > >comparison points I am thinking about are shell here-documents and > >Perl quoting conventions, both of which are used for many things. > > > > > Anyway, > a viable solution seems to be very near, the general direction is quite > clear. > Problem with pure literal quote strings is that they couldn't be > immediately adjacent to literal strings, so they should start end end > with special chars: > > $QUOTE1$This is my string$ENDQUOTE1$ You mean if the special quotes are <-- and -->, <-----> would be the same as '-'? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: >You mean if the special quotes are <-- and -->, <-----> would be the >same as '-'? > > If they work as the standard ' quote (and that seems to be Toms intention), obviously. Besides, we have to care specially about -->. Remember the complaints about select 1--1, behaving differently between MySQL and PostgreSQL. -- should remain comment under all circumstances. Too sad, all special chars are used up for operators.... Regards, Andreas
Andreas Pflug wrote: > Bruce Momjian wrote: > > >You mean if the special quotes are <-- and -->, <-----> would be the > >same as '-'? > > > > > > If they work as the standard ' quote (and that seems to be Toms > intention), obviously. > > Besides, we have to care specially about -->. Remember the complaints about > select 1--1, > behaving differently between MySQL and PostgreSQL. -- should remain > comment under all circumstances. > > Too sad, all special chars are used up for operators.... Oh, I never intended <--, --> to be used --- they were just examples. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Sergio A. Kessler wrote: > > Too sad, all special chars are used up for operators.... also '{' '}' are used ? I've only seen this in ACLs, so it might be usable. Tom, Bruce? Regards, Andreas
Andreas Pflug wrote: > Sergio A. Kessler wrote: > > > > > > Too sad, all special chars are used up for operators.... > > > also '{' '}' are used ? > > I've only seen this in ACLs, so it might be usable. Tom, Bruce? Something that includes "'" would be clearest. I thought of <' and '>, but this would break: if var <'yes' I think {' and '} might work. Arrays are specified as '{val, val}', which is safe because it is opposite of the suggested syntax. I can't think of any case where you would use an opening brace, then a single quote. Interestingly, it looks like a C braces: CREATE FUNCTION test() ...{' x = 'no';'} Of course, this brings up a problem. What if we do: CREATE FUNCTION test() ...{' x = '}text';'} Oops, two closing mega-quotes. One clean way would be to use {' to start a quote, and }' to end it, so we have: CREATE FUNCTION test() ...{' x = 'text';}' which looks even better and this is safe because both braces in '}text}' are seen in a quoted string: CREATE FUNCTION test() ...{' x = '}text}';}' Also, I can't imagine anyone defining those as operators. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: >Something that includes "'" would be clearest. I thought of <' and '>, >but this would break: > > I'm not sure that using a quote is necessarily clearest. But it's a matter of taste. I had thought of {{ and }} as maybe working. [snip] >One clean way would be to use {' to start a quote, and }' to end it, so >we have: > > > CREATE FUNCTION test() ... > {' > x = 'text'; > }' > >which looks even better and this is safe because both braces in '}text}' >are seen in a quoted string: > > CREATE FUNCTION test() ... > {' > x = '}text}'; > }' > >Also, I can't imagine anyone defining those as operators. > > > Quite cute. I like it better than the here-document style. Is this proposed as a general quoting mechanism, or only in the context of "create function"? (I favor a general mechanism, if that matters :-) - I got caught three times in the last 2 weeks with embedded quotes in "comment on" statements.) cheers andrew
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Something that includes "'" would be clearest. I thought of <' and '>, > but this would break: > if var <'yes' People seem to be assuming that this feature needs to be impervious to whitespace and being adjacent to other things. I believe we could make it a good deal more robust if both the opening and closing markers (whatever they are) are required to stand alone on a line. For example {'}' represents the empty string, and {' x = 'text';}' represents '\tx = \'text\';'. I think the rule would need to be that the newline just after the opening marker, and the newline just before the closing marker, are not included in the resulting string literal. The point is we don't have to make case like {'xxx}' be recognized as an occurrence of this construct, and on the whole I think it's a lot safer if we don't. Keep in mind that the string you are trying to quote is often code in a language that is not SQL. Therefore, choosing markers on the grounds that they won't appear in SQL is not good enough. regards, tom lane
On Thu, 11 Sep 2003, Tom Lane wrote: > People seem to be assuming that this feature needs to be impervious to > whitespace and being adjacent to other things. I believe we could make > it a good deal more robust if both the opening and closing markers > (whatever they are) are required to stand alone on a line. For example > > {' > }' > > represents the empty string, and > > {' > x = 'text'; > }' > > represents '\tx = \'text\';'. I think the rule would need to be that > the newline just after the opening marker, and the newline just before > the closing marker, are not included in the resulting string literal. > > The point is we don't have to make case like {'xxx}' be recognized as an > occurrence of this construct, and on the whole I think it's a lot safer > if we don't. Keep in mind that the string you are trying to quote is > often code in a language that is not SQL. Therefore, choosing markers > on the grounds that they won't appear in SQL is not good enough. Agreed. But I think it would be very nice to not invent yet another block quoting mechanism if possible. I don't understand the resistance to here documents, since they solve the problem Tom brought up -- the token will by definition never be a problem because it's alone on a line, and the user chooses it and can avoid problematic ones. Or to combine COPY style (but not its under-the-hood operation) with here documents, why not: INSERT INTO sometable (5, <<\. a very long string \. ); Is there a reason not to use here documents? Jon
Tom Lane wrote: > I believe we could make >it a good deal more robust if both the opening and closing markers >(whatever they are) are required to stand alone on a line. > Hard to detect whitespace might trip things up. I wish I had a $ for every time that has made my life difficult. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> I believe we could make >> it a good deal more robust if both the opening and closing markers >> (whatever they are) are required to stand alone on a line. >> > Hard to detect whitespace might trip things up. I wish I had a $ for > every time that has made my life difficult. I could accept a definition that allowed trailing whitespace on the same line as the marker, but not any other visible characters. regards, tom lane
Jon Jensen <jon@endpoint.com> writes: > Is there a reason not to use here documents? The $$FOO proposal I put forward earlier was consciously modeled on here-documents. We cannot use exactly the shell syntax for here-documents, though, mainly because we already have meaning assigned to strings like <<' (<< is already a standard operator, and the ' could be the start of an ordinary literal). I would definitely like to see us adopt a proposal that is like here-documents to the extent that there's a family of possible terminator markers and not only one. But we'll have to adjust the syntax a little bit. If you don't like $$FOO, what else comes to mind? regards, tom lane
Bruce Momjian wrote: > Sounds good. I just think keywords in general are weird to use for > quoting. We use "'" for quoting, so something similar like another > operator combination would be nice. I have never been fond of the > here-document approach, though I can see the value of doing > here-documents in here-documents, though if we make the open and close > of the quote string different <--, -->, we can still do that, no? > The beauty of here-documents is that you specify your closing tag on a per usage base and can vary that depending on the content you need to enclose. Keep in mind that this literal mechanism is not only used for PL/pgSQL, but for other languages like PL/Tcl and PL/Perl as well. Imagine the pain for a Tcl programmer if we'd go with curly braces! Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Tom Lane wrote: >Andrew Dunstan <andrew@dunslane.net> writes: > > >>Tom Lane wrote: >> >> >>>I believe we could make >>>it a good deal more robust if both the opening and closing markers >>>(whatever they are) are required to stand alone on a line. >>> >>> >>> >>Hard to detect whitespace might trip things up. I wish I had a $ for >>every time that has made my life difficult. >> >> > >I could accept a definition that allowed trailing whitespace on the same >line as the marker, but not any other visible characters. > > This contradicts a wish you made earlier, using this extended quoting for further reducing quote-quoting. IMHO, such a single-line quote marker would be totally sufficient, it would reduce '''''''' to '''', and if you write \'\' (what I prefer) it gets even more distinguishable (actually, this is the way pgAdmin3's function wizard works). But if we're back to a single line, it's somehow like a keyword, separated by space-chars. FUNCTIONBODY might be a bit function-centric, so how about QUOTE and ENDQUOTE? In case another quote level makes sense, it could be QUOTE(1) and ENDQUOTE(1), or only a pair of QUOTE(n). Anyway, I think such a here-documents or single line quote marker solution would be a great enhancement, details are now merely a question of taste. While this sounds "nearly done" for me, as there seems some fundamental consense, I'll will probably have to wait for 7.5? Regards, Andreas
Jan Wieck wrote: > > > Bruce Momjian wrote: > > > Sounds good. I just think keywords in general are weird to use for > > quoting. We use "'" for quoting, so something similar like another > > operator combination would be nice. I have never been fond of the > > here-document approach, though I can see the value of doing > > here-documents in here-documents, though if we make the open and close > > of the quote string different <--, -->, we can still do that, no? > > > > The beauty of here-documents is that you specify your closing tag on a > per usage base and can vary that depending on the content you need to > enclose. Keep in mind that this literal mechanism is not only used for > PL/pgSQL, but for other languages like PL/Tcl and PL/Perl as well. > Imagine the pain for a Tcl programmer if we'd go with curly braces! Oh, non-SQL languages. OK, I agree, we need HERE documents, and I think we all agree it has to be done in the lexer. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Andreas Pflug wrote: > solution would be a great enhancement, details are now merely a question > of taste. > > While this sounds "nearly done" for me, as there seems some fundamental > consense, I'll will probably have to wait for 7.5? Yes, has to wait for 7.5. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Tom Lane wrote: > Jon Jensen <jon@endpoint.com> writes: > > Is there a reason not to use here documents? > > The $$FOO proposal I put forward earlier was consciously modeled on > here-documents. We cannot use exactly the shell syntax for > here-documents, though, mainly because we already have meaning assigned > to strings like <<' (<< is already a standard operator, and the ' could > be the start of an ordinary literal). > > I would definitely like to see us adopt a proposal that is like > here-documents to the extent that there's a family of possible > terminator markers and not only one. But we'll have to adjust the > syntax a little bit. If you don't like $$FOO, what else comes to mind? Couldn't we allow << at the beginning of the line to mean 'here' document? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: > Jan Wieck wrote: >> >> >> Bruce Momjian wrote: >> >> > Sounds good. I just think keywords in general are weird to use for >> > quoting. We use "'" for quoting, so something similar like another >> > operator combination would be nice. I have never been fond of the >> > here-document approach, though I can see the value of doing >> > here-documents in here-documents, though if we make the open and close >> > of the quote string different <--, -->, we can still do that, no? >> > >> >> The beauty of here-documents is that you specify your closing tag on a >> per usage base and can vary that depending on the content you need to >> enclose. Keep in mind that this literal mechanism is not only used for >> PL/pgSQL, but for other languages like PL/Tcl and PL/Perl as well. >> Imagine the pain for a Tcl programmer if we'd go with curly braces! > > Oh, non-SQL languages. OK, I agree, we need HERE documents, and I think > we all agree it has to be done in the lexer. > Not only. psql by default sends the edit buffer when it encounters the ";". You don't want that to happen inside of the here document. So it has to duplicate that logic. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Tom Lane wrote: >> The $$FOO proposal I put forward earlier was consciously modeled on >> here-documents. > Couldn't we allow << at the beginning of the line to mean 'here' document? No; you could easily be breaking existing queries, for example regression=# select f1, f1 << 8 from int4_tbl; f1 | ?column? -------------+----------- 0 | 0 123456 | 31604736 -123456 | -31604736 2147483647 | -256-2147483647| 256 (5 rows) is only one unfortunate choice of line break away from being eaten by such a proposal. I suggested $$ because AFAIK we don't currently have any valid syntax that would allow that to appear at the start of a line (it helps a great deal that we just removed $ from the set of characters allowed in operators ;-)). If you consider my $$FOO proposal to include the possibility of a zero-length FOO string, then the shortest legal alternative would be $$string contents here$$ but adding a string to that reduces the odds of conflict (especially when you consider languages like plperl; IIRC, $$ is something or other useful in Perl). Also you can use mnemonically-chosen strings; maybe Andreas will like $$FUNCTIONBODYtext here$$FUNCTIONBODY regards, tom lane
Jan Wieck wrote: > >> The beauty of here-documents is that you specify your closing tag on a > >> per usage base and can vary that depending on the content you need to > >> enclose. Keep in mind that this literal mechanism is not only used for > >> PL/pgSQL, but for other languages like PL/Tcl and PL/Perl as well. > >> Imagine the pain for a Tcl programmer if we'd go with curly braces! > > > > Oh, non-SQL languages. OK, I agree, we need HERE documents, and I think > > we all agree it has to be done in the lexer. > > > > Not only. psql by default sends the edit buffer when it encounters the > ";". You don't want that to happen inside of the here document. So it > has to duplicate that logic. Yep, good point. It already understands quoting and comments, so it will have to understand this too. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Tom Lane wrote: > >> The $$FOO proposal I put forward earlier was consciously modeled on > >> here-documents. > > > Couldn't we allow << at the beginning of the line to mean 'here' document? > > No; you could easily be breaking existing queries, for example Yes, that is true. I didn't like the beginning-of-line requirement for here documents for that reason. However, we are already requiring the thing to be at the beginning of the line. You are saying it is safer to make it at the beginnning of a line _and_ have it be something that isn't used in SQL, but $$ is used in Perl, so I don't see the big advantage either way --- once you say X has to begin at the beginning of the line, we are already making things breakable by a newline, no? You could make the point that using $$ makes only the text inside the quote as newline-sensitive, while using << makes any SQL newline-sensitive, and that might be the major advantage of something like $$. To me, however, the dollar sign is just too overloaded for function arguments, which you will likely see in the function text. > I suggested $$ because AFAIK we don't currently have any valid syntax > that would allow that to appear at the start of a line (it helps a great > deal that we just removed $ from the set of characters allowed in > operators ;-)). If you consider my $$FOO proposal to include the > possibility of a zero-length FOO string, then the shortest legal > alternative would be > > $$ > string contents here > $$ > > but adding a string to that reduces the odds of conflict (especially > when you consider languages like plperl; IIRC, $$ is something or other > useful in Perl). Also you can use mnemonically-chosen strings; maybe Of course, every operator combination is used by Perl. :-) > Andreas will like > > $$FUNCTIONBODY > text here > $$FUNCTIONBODY So you are requiring the identical text to appear at the beginning and end of the quote, rather than a here document that would be: <<END...END or in your example: $$END...END -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On Thu, 11 Sep 2003, Bruce Momjian wrote: > > I would definitely like to see us adopt a proposal that is like > > here-documents to the extent that there's a family of possible > > terminator markers and not only one. But we'll have to adjust the > > syntax a little bit. If you don't like $$FOO, what else comes to mind? > > Couldn't we allow << at the beginning of the line to mean 'here' document? Even if that worked, it diverges so much from shell and Perl here document syntax (where it's customary to have it at the end of the line, or even in the middle), it wouldn't be any more familiar than anything else. I think Tom's $$FOO suggestion is fine -- it just takes getting used to. Jon
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Yes, that is true. I didn't like the beginning-of-line requirement for > here documents for that reason. However, we are already requiring the > thing to be at the beginning of the line. You are saying it is safer to > make it at the beginnning of a line _and_ have it be something that > isn't used in SQL, but $$ is used in Perl, so I don't see the big > advantage either way --- once you say X has to begin at the beginning of > the line, we are already making things breakable by a newline, no? But as Jan pointed out, you can choose your string so as not to conflict with whatever is in the text to be quoted. So you'd probably *not* use plain $$ as marker if working with a plperl function. $$FUNCTION would have a reasonably good chance of not conflicting, and if by some chance it did match a line you want in the text, you pick another. It is possible that we could allow the start marker to be not at the beginning of its line, which would create structures very very close to shell here-documents: CREATE FUNCTION foo() RETURNS int AS $$FUNCTION... text here ...$$FUNCTION This would be a little more open to typos --- if you leave out the space so that it reads CREATE FUNCTION foo() RETURNS int AS$$FUNCTION then you wrote an identifier, not AS followed by a here-document marker. But it might be worth defining it that way anyway because of the similarity to shell notation. Or maybe I'm assuming too much about whether the average SQL programmer has ever heard of shell here-documents. regards, tom lane
Bruce Momjian <pgman@candle.pha.pa.us> writes: > So you are requiring the identical text to appear at the beginning and > end of the quote, rather than a here document that would be: > <<END > ... > END > or in your example: > $$END > ... > END Yes, I was thinking of requiring the $$ to appear at both beginning and end. This is perhaps not critical, but it seems more symmetric that way. Also, we might even be able to get away with allowing SQL text to resume on the same line as the terminator, for example CREATE FUNCTION foo() RETURNS int AS $$FUNCTION... text here ...$$FUNCTION LANGUAGE plpgsql; I would not want to risk that with a plain word as terminator, but $$ helps a lot to make it distinctive. regards, tom lane
Jon Jensen wrote: > On Thu, 11 Sep 2003, Bruce Momjian wrote: > > > > I would definitely like to see us adopt a proposal that is like > > > here-documents to the extent that there's a family of possible > > > terminator markers and not only one. But we'll have to adjust the > > > syntax a little bit. If you don't like $$FOO, what else comes to mind? > > > > Couldn't we allow << at the beginning of the line to mean 'here' document? > > Even if that worked, it diverges so much from shell and Perl here document > syntax (where it's customary to have it at the end of the line, or even in > the middle), it wouldn't be any more familiar than anything else. > > I think Tom's $$FOO suggestion is fine -- it just takes getting used to. Oh, sorry, I forgot 'here' documents don't have to start the line. I was looking at the manual page and not thinking of the code I write with here documents. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian writes: > Tom Lane wrote: > > Jon Jensen <jon@endpoint.com> writes: > > > Is there a reason not to use here documents? > > > > The $$FOO proposal I put forward earlier was consciously modeled on > > here-documents. We cannot use exactly the shell syntax for > > here-documents, though, mainly because we already have meaning assigned > > to strings like <<' (<< is already a standard operator, and the ' could > > be the start of an ordinary literal). > > > > I would definitely like to see us adopt a proposal that is like > > here-documents to the extent that there's a family of possible > > terminator markers and not only one. But we'll have to adjust the > > syntax a little bit. If you don't like $$FOO, what else comes to mind? > > Couldn't we allow << at the beginning of the line to mean 'here' document? What about the Python approach: The literal text is enclosed either in a pair of three single quotes or three double quotes. So you can do (e.g. in the python shell) > s = """ This is 'one' string. """ to set string s or > s = ''' This is 'one' string. ''' Even > s = """ This is 'one' string with '''triple single quotes''' and some "double" quotes. """ works: > print s This is 'one' string with '''triple single quotes''' and some "double" quotes. Bye, Tilo
On Thu, 11 Sep 2003, Tom Lane wrote: > It is possible that we could allow the start marker to be not at the > beginning of its line, which would create structures very very close > to shell here-documents: > > CREATE FUNCTION foo() RETURNS int AS $$FUNCTION > ... text here ... > $$FUNCTION > > This would be a little more open to typos --- if you leave out the space > so that it reads > > CREATE FUNCTION foo() RETURNS int AS$$FUNCTION > > then you wrote an identifier, not AS followed by a here-document marker. > But it might be worth defining it that way anyway because of the > similarity to shell notation. Or maybe I'm assuming too much about > whether the average SQL programmer has ever heard of shell > here-documents. Perl's adopting shell here-document syntax has made them pretty well-known even outside of Unix, but if we diverge at all we'd might as well do what makes the most sense for PostgreSQL. I think your proposal above is good, at least enough to write code for and test with. Jon
Tilo Schwarz <mail@tilo-schwarz.de> writes: > What about the Python approach: The literal text is enclosed either in a pair > of three single quotes or three double quotes. That might be okay if we were working in a vacuum, but we aren't. Among other things, the SQL spec tells us what 'quote''' ... means, and it's not Python-compatible. regards, tom lane
Tom Lane wrote: >Bruce Momjian <pgman@candle.pha.pa.us> writes: > > >>So you are requiring the identical text to appear at the beginning and >>end of the quote, rather than a here document that would be: >> >> > > > >> <<END >> ... >> END >> >> > > > >>or in your example: >> >> > > > >> $$END >> ... >> END >> >> > >Yes, I was thinking of requiring the $$ to appear at both beginning and >end. This is perhaps not critical, but it seems more symmetric that >way. Also, we might even be able to get away with allowing SQL text to >resume on the same line as the terminator, for example > > CREATE FUNCTION foo() RETURNS int AS $$FUNCTION > ... text here ... > $$FUNCTION LANGUAGE plpgsql; > >I would not want to risk that with a plain word as terminator, but $$ >helps a lot to make it distinctive. > > This discussion now seems to be on the right track - thanks for recognising a need, meeting which will clearly improve the useability of the product. cheers andrew
Bruce Momjian <pgman@candle.pha.pa.us> writes: > ... You are saying it is safer to > make it at the beginnning of a line _and_ have it be something that > isn't used in SQL, but $$ is used in Perl, so I don't see the big > advantage either way --- once you say X has to begin at the beginning of > the line, we are already making things breakable by a newline, no? Keep in mind that we have two different requirements: the quote start marker has to be recognizable while we are parsing SQL (or possibly plpgsql) code. The quote end marker has to be recognizable while we are scanning text that could be almost anything. The cute thing about the here-document solution to this problem is that you can choose the quote end marker on a case-by-case basis. So you can always pick something that won't conflict with anything that's actually in the text you need to quote. If we try to go with fixed markers (like {' ... }' and some other ideas that were floated today), then we lose that flexibility, and we're up against the losing game of trying to pick an end-marker that won't cause problems in any programming language anywhere (not to mention possible uses of the quoting mechanism for arbitrary string literals that aren't even function bodies). I'm not wedded to the "$$FOO" idea in particular, but I do think we want to go with a solution that allows a variable end-marker. regards, tom lane
Tom Lane wrote: > Keep in mind that we have two different requirements: the quote start > marker has to be recognizable while we are parsing SQL (or possibly > plpgsql) code. The quote end marker has to be recognizable while we are > scanning text that could be almost anything. > > The cute thing about the here-document solution to this problem is that > you can choose the quote end marker on a case-by-case basis. So you can > always pick something that won't conflict with anything that's actually > in the text you need to quote. > > If we try to go with fixed markers (like {' ... }' and some other ideas > that were floated today), then we lose that flexibility, and we're up > against the losing game of trying to pick an end-marker that won't cause > problems in any programming language anywhere (not to mention possible > uses of the quoting mechanism for arbitrary string literals that aren't > even function bodies). > > I'm not wedded to the "$$FOO" idea in particular, but I do think we want > to go with a solution that allows a variable end-marker. Agreed --- with multiple languages, we have to allow users to specify a unique end marker. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
> >> The $$FOO proposal I put forward earlier was consciously modeled on > >> here-documents. > > > Couldn't we allow << at the beginning of the line to mean 'here' document? > > No; you could easily be breaking existing queries, for example Let me jump in for half a second here (no pun intended), but what about the use of back quotes? ` `? Use a very limited escaping policy of \` => ` and \\ => \ . Back quotes aren't used in any SQL or PL/PgSQL that I'm aware of. I know ruby, perl, and sh make use of back ticks, but for all intents and purposes, pl/perl shouldn't use back ticks, and if it is, for performance reasons a DBA shouldn't. Use of open() or system() should be encouraged as it is possible to avoid fork()/exec()'ing a shell, nevermind that ``'s aren't the wisest inclusions for DBAs. If ``'s are needed in the pl language, they can be nominally included with a \`\`, but given their relative rareness compared to ' or ", I'd think the addition of ` would be welcome and much less cumbersome/easier to remember than other options that have popped up in this thread. $0.02 -sc -- Sean Chittenden
Sean Chittenden <sean@chittenden.org> writes: > Let me jump in for half a second here (no pun intended), but what > about the use of back quotes? ` `? Use a very limited escaping policy > of \` => ` and \\ => \ . Actually, having to double backslashes is one of the things I want to get rid of. The here-document-based ideas seem to allow that. regards, tom lane
> > Let me jump in for half a second here (no pun intended), but what > > about the use of back quotes? ` `? Use a very limited escaping > > policy of \` => ` and \\ => \ . > > Actually, having to double backslashes is one of the things I want > to get rid of. The here-document-based ideas seem to allow that. Hrm, that would be nice to get rid of as \ is a highly overloaded, overused character. As someone who is presently in the throws of writing a new language, might I suggest using non-newline anchored token as opposed to more dynamic token? Using $$[.*]\n as a lexical token is a quasi-problematic as the anchor is the newline, something that SQL has been free of for as long as I'm aware of. By using a static lexical token, such as @@, newline's aren't important, thus reducing the number of accidental syntax errors from programmers. While I abhor the "let's put a magic token in this context to handle this quirk" grammar design methodology that Perl has brought, I do think that a simple doubling up of a nearly unused operator would be sufficient, concise, and easy. For example: !! Invalid as !! is a valid expression, though a NOOP. @@ Valid candidate as @@ is an invalid expression ## Valid candidate, but common comment syntax, avoid using $$ Valid candidate, but again, a common syntax in shell like languages %% Valid candidate, %% is an invalid expression ^^ Invalid candidate, ^^ is a valid expression && Invalid as && is a valid token ** Valid candidate, but ** is used as a power operator in Ruby Of the above, I'd think @@, %%, or $$ would be the best choices. If a dynamic token is desired, use a token that is terminally anchored with something other than a new line to keep PostgreSQL's SQL contextually free from newlines. If the desire for something HERE document-like is strong enough... well, how about the following flex patterns: @(@[^\n]+\n|[^@]*@) %(%[^\n]+\n|[^%]*%) $($[^\n]+\n|[^$]*$) If the developer knows his/her string and opts to use an empty string to name the token, so be it, @@ would be the beginning and terminating token for a literal string block. If the developer writing something with pl/autoconf (doesn't exist!!! Just an example of where @@ is used), then @autoconf me harder@ could be used as the start and ending token, which should provide enough bits to prevent the likelihood of the string being used in the enclosed data. If a newline is desired, it would be valid in the above: @ @ Inside the block @ @ @z@Inside the block@z@ and the resulting string would be " Inside the block ". %{ /* Headers/definitions/prototypes */ #include <string.h> static bool initialized = false; static char *lit_name; static char *lit_val; %} lit_quote_pattern @(@[^\n]+\n|[^@]*@) %x LIT_QUOTE %x SQL %% %{ /* Init bits */if (!initialized) { BEGIN(SQL); initialized = true;} %} <SQL>{lit_quote_pattern} { /* -2 == leading/trailing chars, +1 '\0' = -1*/ lit_name = malloc(yyleng - 1); strncpy(&lit_name,&yyleng[1], yyleng - 2); lit_name[yyleng-1] = '\0'; lit_val = NULL; BEGIN(LIT_QUOTE);} <LIT_QUOTE>{lit_quote_pattern} { /* */ if (strncmp(lit_name, yytext[1], yyleng - 2) == 0) { /* Found the terminator,set yylval.??? to lit_val after appending yytext and return whatever the string type is to yyparse() */ yylval.???= strdup(lit_val); free(lit_val); free(lit_name); lit_name = lit_val = NULL; BEGIN(SQL); return(tSTRING); } else { /* Do nothing until we hit a match */ }} <LIT_QUOTE>. { /* Not sure these func names off the top of my head: */ pg_append_str_to_buf(lit_val, yytext, yyleng);} %% /* Or something similarly flexible */ -sc -- Sean Chittenden
Sean Chittenden <sean@chittenden.org> writes: > Using $$[.*]\n as a lexical token is a quasi-problematic as the anchor > is the newline, something that SQL has been free of for as long as I'm > aware of. By using a static lexical token, such as @@, newline's > aren't important, thus reducing the number of accidental syntax errors > from programmers. While I abhor the "let's put a magic token in this > context to handle this quirk" grammar design methodology that Perl has > brought, I do think that a simple doubling up of a nearly unused > operator would be sufficient, concise, and easy. No, it absolutely would not be. We could pick something that would not create conflicts as a quote start marker, since we know what the SQL grammar is. But it would be guaranteed to fail as an end marker. Let me remind you that the goal here is to be able to quote any text whatever, including nested uses of the same quoting mechanism. After sleeping on it, I do think that tying the mechanism to newlines is just unnecessary complication. I'm currently leaning to an idea that was suggested yesterday by (I think) Andreas: let the quote start marker be a token of the formdollarsign zero-or-more-letters dollarsign and let the quote body extend to the next occurrence of the identical string. For example... $Q$Joe's house$Q$ ... is equivalent to... 'Joe''s house' ... This is extremely compact for quoting strings that don't contain any doubled dollar signs, since you don't need any letters at all. I could see $$text$$ becoming a very common way to quote material that contains single quotes or backslashes. But since you can choose any string of letters to make up the terminating token, the mechanism is able to quote any text whatever, including nested occurrences of the same structure (with a different letterstring of course). Note that there is no particular need to insist on any nearby newlines. If the construct is written just following an identifier or keyword, then you do need some intervening whitespace to keep the $Q$ from being read as part of that identifier, but I doubt this will bother anyone. Note that I'm allowing only letters, not digits, in the string; this avoids any possible ambiguity with $n parameter tokens. We have no other SQL tokens that are allowed to start with $, so this creates no other lexical ambiguity. Comments? regards, tom lane
Tom Lane wrote: > Note that there is no particular need to insist on any nearby newlines. > If the construct is written just following an identifier or keyword, > then you do need some intervening whitespace to keep the $Q$ from being > read as part of that identifier, but I doubt this will bother anyone. > > Note that I'm allowing only letters, not digits, in the string; this > avoids any possible ambiguity with $n parameter tokens. We have no > other SQL tokens that are allowed to start with $, so this creates no > other lexical ambiguity. Sounds interesting. So it is $$text$$ or $quote$text$quote$? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Tom Lane wrote: >> Note that there is no particular need to insist on any nearby newlines. >> If the construct is written just following an identifier or keyword, >> then you do need some intervening whitespace to keep the $Q$ from being >> read as part of that identifier, but I doubt this will bother anyone. >> >> Note that I'm allowing only letters, not digits, in the string; this >> avoids any possible ambiguity with $n parameter tokens. We have no >> other SQL tokens that are allowed to start with $, so this creates no >> other lexical ambiguity. > Sounds interesting. So it is $$text$$ or $quote$text$quote$? Either would work, yes. regards, tom lane
On Fri, 12 Sep 2003, Tom Lane wrote: > I'm currently leaning to an idea that was suggested yesterday by (I > think) Andreas: let the quote start marker be a token of the form > dollarsign zero-or-more-letters dollarsign > and let the quote body extend to the next occurrence of the identical > string. For example > ... $Q$Joe's house$Q$ ... > is equivalent to > ... 'Joe''s house' ... > > This is extremely compact for quoting strings that don't contain any > doubled dollar signs, since you don't need any letters at all. I could > see $$text$$ becoming a very common way to quote material that contains > single quotes or backslashes. But since you can choose any string of > letters to make up the terminating token, the mechanism is able to quote > any text whatever, including nested occurrences of the same structure > (with a different letterstring of course). > > Note that there is no particular need to insist on any nearby newlines. > If the construct is written just following an identifier or keyword, > then you do need some intervening whitespace to keep the $Q$ from being > read as part of that identifier, but I doubt this will bother anyone. And it's still far less onerous a requirement than here documents have with their newlines. > Note that I'm allowing only letters, not digits, in the string; this > avoids any possible ambiguity with $n parameter tokens. We have no > other SQL tokens that are allowed to start with $, so this creates no > other lexical ambiguity. I like it. Bruce said: > Sounds interesting. So it is $$text$$ or $quote$text$quote$? Either way, as I understand the proposal. Jon
Sean Chittenden <sean@chittenden.org> writes: > ... then @autoconf me harder@ could be used as the start and ending > token, Hm, I should have read your message more carefully --- I missed the bit at the middle where you propose nearly the same idea I had ;-). But the flex patterns you wrote don't actually support this do they? > @(@[^\n]+\n|[^@]*@) > %(%[^\n]+\n|[^%]*%) > $($[^\n]+\n|[^$]*$) Doesn't quite seem to do what we're talking about here. I don't see a need for three of these; that just eats up lexical token space. Is there a reason for the $ variant not to be enough? In any case, @ and % are valid (and popular) operator names in Postgres, so we could not use them for this purpose without removing that meaning, which would be painful. regards, tom lane
Tom Lane wrote: >After sleeping on it, I do think that tying the mechanism to newlines >is just unnecessary complication. I'm currently leaning to an idea that >was suggested yesterday by (I think) Andreas: let the quote start marker >be a token of the form > dollarsign zero-or-more-letters dollarsign >and let the quote body extend to the next occurrence of the identical >string. For example > ... $Q$Joe's house$Q$ ... >is equivalent to > ... 'Joe''s house' ... > >This is extremely compact for quoting strings that don't contain any >doubled dollar signs, since you don't need any letters at all. I could >see $$text$$ becoming a very common way to quote material that contains >single quotes or backslashes. But since you can choose any string of >letters to make up the terminating token, the mechanism is able to quote >any text whatever, including nested occurrences of the same structure >(with a different letterstring of course). > >Note that there is no particular need to insist on any nearby newlines. >If the construct is written just following an identifier or keyword, >then you do need some intervening whitespace to keep the $Q$ from being >read as part of that identifier, but I doubt this will bother anyone. > >Note that I'm allowing only letters, not digits, in the string; this >avoids any possible ambiguity with $n parameter tokens. We have no >other SQL tokens that are allowed to start with $, so this creates no >other lexical ambiguity. > >Comments? > > > I like it. It is really quite similar to perl's q$text$ mechanism, but making allowances for the fact we are in a multi-language environment. I presume the delimiter will never be kept, but eaten by the lexer. I'd like to see pg_dump use this mechanism for quoting, at least for function bodies. I guess it could retrieve the text and then keep generating delimiters until it found one that didn't occur inside the text. Maybe for that purpose we could allow underscores as well as letters - I don't think that should introduce any extra ambiguities. Alternatively, or as well, maybe leading and trailing digits could be disallowed, but embedded digits could be allowed. IOW let's be as liberal as possible without breaking things. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > I'd like to see pg_dump use this mechanism for quoting, at least for > function bodies. I guess it could retrieve the text and then keep > generating delimiters until it found one that didn't occur inside the > text. Right, that was what I had in mind. > Maybe for that purpose we could allow underscores as well as > letters - I don't think that should introduce any extra ambiguities. > Alternatively, or as well, maybe leading and trailing digits could be > disallowed, but embedded digits could be allowed. IOW let's be as > liberal as possible without breaking things. <shrug> I'd prefer to keep the rules as simple as possible. No particular objection to underscores, but allowing digits in only some places seems confusing. regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > I'd like to see pg_dump use this mechanism for quoting, at least for > > function bodies. I guess it could retrieve the text and then keep > > generating delimiters until it found one that didn't occur inside the > > text. > > Right, that was what I had in mind. > > > Maybe for that purpose we could allow underscores as well as > > letters - I don't think that should introduce any extra ambiguities. > > Alternatively, or as well, maybe leading and trailing digits could be > > disallowed, but embedded digits could be allowed. IOW let's be as > > liberal as possible without breaking things. > > <shrug> I'd prefer to keep the rules as simple as possible. No > particular objection to underscores, but allowing digits in only some > places seems confusing. Yep. If we can do C indentifiers (no digits to start), fine, if not, we just have to disallow them. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Tom Lane wrote: >Andrew Dunstan <andrew@dunslane.net> writes: > > >>I'd like to see pg_dump use this mechanism for quoting, at least for >>function bodies. I guess it could retrieve the text and then keep >>generating delimiters until it found one that didn't occur inside the >>text. >> >> > >Right, that was what I had in mind. > > great > > >>Maybe for that purpose we could allow underscores as well as >>letters - I don't think that should introduce any extra ambiguities. >>Alternatively, or as well, maybe leading and trailing digits could be >>disallowed, but embedded digits could be allowed. IOW let's be as >>liberal as possible without breaking things. >> >> > ><shrug> I'd prefer to keep the rules as simple as possible. No >particular objection to underscores, but allowing digits in only some >places seems confusing. > > > We already do - you can't begin an identifier with a digit. I'm not dogmatic about it, though. I'd be more than happy with just what you proposed. cheers andrew
> > ... then @autoconf me harder@ could be used as the start and > > ending token, > > Hm, I should have read your message more carefully --- I missed the > bit at the middle where you propose nearly the same idea I had ;-). > But the flex patterns you wrote don't actually support this do they? They support both your original idea of "$$FOO\n" and "$FOO$" as tokens. > > @(@[^\n]+\n|[^@]*@) > > %(%[^\n]+\n|[^%]*%) > > $($[^\n]+\n|[^$]*$) > > Doesn't quite seem to do what we're talking about here. The only reason I listed all three is I don't believe that there would be any more lexical ambiguity than with any of the other operators. scan.l, line 195: op_chars [\~\!\@\#\^\&\|\`\?\+\-\*\/\%\<\>\=] At line 428 in scan.l, include: \@[\^\n\@]*\@ { /* Will take lexical precedence over {operator} */ } And that should be the end of it since this new token returns a string, just the same way that ' does, but without any escaping done. > I don't see a need for three of these; that just eats up lexical > token space. Is there a reason for the $ variant not to be enough? Nope, no reason why it isn't enough. $ is just as valid as any of the other characters. In my mind, since @ has a more limited use than $ in a global context (perl, shell, ruby), I figure its use to be safer than @, even though PostgreSQL uses @ for abs (rarely seen in the wild from my experience, and has a lower match precedence in flex). > In any case, @ and % are valid (and popular) operator names in > Postgres, so we could not use them for this purpose without removing > that meaning, which would be painful. Really? Hrm, guess @'s out. $ works for me. -sc -- Sean Chittenden
Tilo Schwarz wrote: > What about the Python approach: The literal text is enclosed either in a pair > of three single quotes or three double quotes. So you can do (e.g. in the > python shell) It'll only make plpyhon functions harder to write, if you need to use longstring quoting INSIDE your plpython functions yourself, e.g. documentation strings. 'Here' documents seems like the most sensible solution to me so far. -- Linux homer 2.4.18-14 #1 Wed Sep 4 13:35:50 EDT 2002 i686 i686 i386 GNU/Linux 12:30pm up 261 days, 3:54, 5 users, load average: 5.21, 5.12, 5.19
Attached is a proof-of-concept diff that makes the main SQL parser accept quoted strings in the style recently discussed ($Q$ ... $Q$). To complete the feature, we'd need to make the plpgsql parser do the same, update psql to understand this, update pg_dump to use this style of quoting for function bodies (and perhaps other things?), and update the documentation. The last is a nontrivial project all by itself. Anybody want to run with this football? regards, tom lane *** src/backend/parser/scan.l.orig Sun Aug 3 23:00:44 2003 --- src/backend/parser/scan.l Sun Sep 14 01:38:55 2003 *************** *** 39,44 **** --- 39,46 ---- static int xcdepth = 0; /* depth of nesting in slash-star comments */ + static char *hqstart; /* current here-document start string */ + /* * literalbuf is used to accumulate literal values when multiple rules * are needed to parse a single literal. Callstartlit to reset buffer *************** *** 95,100 **** --- 97,103 ---- * <xd> delimited identifiers (double-quoted identifiers) * <xh> hexadecimal numeric string * <xq> quotedstrings + * <hq> here-document-style quoted strings */ %x xb *************** *** 102,107 **** --- 105,111 ---- %x xd %x xh %x xq + %x hq /* Bit string * It is tempting to scan the string for only those characters *************** *** 141,146 **** --- 145,157 ---- xqoctesc [\\][0-7]{1,3} xqcat {quote}{whitespace_with_newline}{quote} + /* Here-document-style quotes + * The quoted string starts with $letters$ and extends to the first occurrence + * of an identical string. There is *no* processing of the quoted text. + */ + hqdelim \$[A-Za-z]*\$ + hqinside [^$]+ + /* Double quote * Allows embedded spaces and other special characters into identifiers. */ *************** *** 387,392 **** --- 398,432 ---- } <xq><<EOF>> { yyerror("unterminated quoted string"); } + {hqdelim} { + token_start = yytext; + hqstart = pstrdup(yytext); + BEGIN(hq); + startlit(); + } + <hq>{hqdelim} { + if (strcmp(yytext, hqstart) == 0) + { + pfree(hqstart); + BEGIN(INITIAL); + yylval.str = litbufdup(); + return SCONST; + } + /* + * When we fail to match $...$ to hqstart, transfer + * the $... part to the output, but put back the final + * $ for rescanning. Consider $delim$...$junk$delim$ + */ + addlit(yytext, yyleng-1); + yyless(yyleng-1); + } + <hq>{hqinside} { + addlit(yytext, yyleng); + } + <hq>. { + addlitchar(yytext[0]); + } + <hq><<EOF>> { yyerror("unterminated special-quoted string"); } {xdstart} { token_start= yytext;
A quick browse through the current doc set suggests that sections 4.1, 38 to 42, and the page on 'create function' at least will need changes. I think calling it 'here-document' quoting is possibly unwise - it is sufficiently different from here documents in shell and perl contexts to make it confusing. We could call it meta-quoting, or alternative quoting, maybe. cheers andrew ----- Original Message ----- From: "Tom Lane" <tgl@sss.pgh.pa.us> To: "Sean Chittenden" <sean@chittenden.org> Cc: "Bruce Momjian" <pgman@candle.pha.pa.us>; "Jon Jensen" <jon@endpoint.com>; <pgsql-hackers@postgresql.org> Sent: Sunday, September 14, 2003 1:49 AM Subject: Re: [HACKERS] massive quotes? > Attached is a proof-of-concept diff that makes the main SQL parser > accept quoted strings in the style recently discussed ($Q$ ... $Q$). > To complete the feature, we'd need to make the plpgsql parser do the > same, update psql to understand this, update pg_dump to use this style > of quoting for function bodies (and perhaps other things?), and update > the documentation. The last is a nontrivial project all by itself. > Anybody want to run with this football? > > regards, tom lane > > > *** src/backend/parser/scan.l.orig Sun Aug 3 23:00:44 2003 > --- src/backend/parser/scan.l Sun Sep 14 01:38:55 2003 > *************** > *** 39,44 **** > --- 39,46 ---- > > static int xcdepth = 0; /* depth of nesting in slash-star comments */ > > + static char *hqstart; /* current here-document start string */ > + > /* > * literalbuf is used to accumulate literal values when multiple rules > * are needed to parse a single literal. Call startlit to reset buffer > *************** > *** 95,100 **** > --- 97,103 ---- > * <xd> delimited identifiers (double-quoted identifiers) > * <xh> hexadecimal numeric string > * <xq> quoted strings > + * <hq> here-document-style quoted strings > */ > > %x xb > *************** > *** 102,107 **** > --- 105,111 ---- > %x xd > %x xh > %x xq > + %x hq > > /* Bit string > * It is tempting to scan the string for only those characters > *************** > *** 141,146 **** > --- 145,157 ---- > xqoctesc [\\][0-7]{1,3} > xqcat {quote}{whitespace_with_newline}{quote} > > + /* Here-document-style quotes > + * The quoted string starts with $letters$ and extends to the first occurrence > + * of an identical string. There is *no* processing of the quoted text. > + */ > + hqdelim \$[A-Za-z]*\$ > + hqinside [^$]+ > + > /* Double quote > * Allows embedded spaces and other special characters into identifiers. > */ > *************** > *** 387,392 **** > --- 398,432 ---- > } > <xq><<EOF>> { yyerror("unterminated quoted string"); } > > + {hqdelim} { > + token_start = yytext; > + hqstart = pstrdup(yytext); > + BEGIN(hq); > + startlit(); > + } > + <hq>{hqdelim} { > + if (strcmp(yytext, hqstart) == 0) > + { > + pfree(hqstart); > + BEGIN(INITIAL); > + yylval.str = litbufdup(); > + return SCONST; > + } > + /* > + * When we fail to match $...$ to hqstart, transfer > + * the $... part to the output, but put back the final > + * $ for rescanning. Consider $delim$...$junk$delim$ > + */ > + addlit(yytext, yyleng-1); > + yyless(yyleng-1); > + } > + <hq>{hqinside} { > + addlit(yytext, yyleng); > + } > + <hq>. { > + addlitchar(yytext[0]); > + } > + <hq><<EOF>> { yyerror("unterminated special-quoted string"); } > > {xdstart} { > token_start = yytext; > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html
"Andrew Dunstan" <andrew@dunslane.net> writes: > I think calling it 'here-document' quoting is possibly unwise - it is > sufficiently different from here documents in shell and perl contexts to > make it confusing. I agree. I've tried to think of a better alternative name, but without much success. > We could call it meta-quoting, or alternative quoting, maybe. Those seem pretty unmemorable and content-free, though. Any other ideas out there? regards, tom lane PS: btw, I have realized that the seemingly obvious algorithm for choosing a delimiter string for given text is wrong. I had imagined it as "try $$, $Q$, $QQ$, $QQQ$, etc, until you find a delimiter not present in the given text string". This is not right because, for example, if the string ends with $Q then $Q$ is not a usable delimiter ($Q$...$Q$Q$ would be misparsed). The simplest correct algorithm is "try $, $Q, $QQ, $QQQ, etc until you find a string not present in the given text string; then use that with a $ appended".
jon@endpoint.com (Jon Jensen) wrote in message news:<Pine.LNX.4.58.0309111950010.7142@louche.swelter.net>... > INSERT INTO sometable (5, <<\. > a > very long > string > \. > ); I'm delighted to hear that here docs are being discussed for postgres.In the world of Perl here docs make life SO MUCH easierby allowing for independent quoting scopes. Jon, I'm totally in agreement with you that here docs would be a valuable addition to postgres. I'd like to propose a few modifications on the syntax you suggest above. In Perl, the rule for here docs is NOT "it starts immediately after the definition of the terminator" (as in your example). The rule is "it starts on the first line after the end of the command in which the terminator appears". There's a very good reason for this distinction.That reason is that it allows the command to be succinct,usually a single line, and the here doc stays completely separate. So, your example would like this instead: INSERT INTO sometable (5, <<'(END)');avery longstring(END) Notice that the entire INSERT command is on one line. Very easy to read. Note also that the terminator definition is quoted, which simplifies any concerns about special characters ending the terminator definition. One other small tweak. In Perl 5 the rule is that the terminator must be flush to the left of the line. That means that here docs can mess up otherwise nicely indented code. That issue is being fixed in Perl 6, where the terminator can be indented with the rest of the code. Each line of the contents of the here doc are left trimmed by the same amount of whitespace as the indentation of the terminator. I suggest we do the same on postgres: if some_test() then INSERT INTO sometable (5, <<'(END)'); a very long string blah blah (END) end; In this example, the very long strong is evaluated such that each line actually begins with the alphanumeric character, except the last line ("blah blah") which begins with four spaces. -Miko
Tom Lane kirjutas P, 14.09.2003 kell 18:58: > "Andrew Dunstan" <andrew@dunslane.net> writes: > > I think calling it 'here-document' quoting is possibly unwise - it is > > sufficiently different from here documents in shell and perl contexts to > > make it confusing. > > I agree. I've tried to think of a better alternative name, but without > much success. > > > We could call it meta-quoting, or alternative quoting, maybe. > > Those seem pretty unmemorable and content-free, though. Any other ideas > out there? Considering that we use $$ instead of quotes we could call it dollarring instead of quoting ;) Or if we were politicians we could call it "memorable contentful meta-quoting" to squish any opposition in the bud ;) ;) I've done lot of programming in python, and while the python tradition generally calls for exactly one syntax for one concept, they have at least 4 ways for quoting -- ' and " for one-line strings and ''' and """ for multi-line ones. While this could have been transferrable to Pl/PgSQL, it would not fly for full pl/python. So I would propose "adaptable quoting" or "multi-language-aware" (or -savvy for mac-heads) quoting, or just dollar-quoting, $-quoting or $x$ quoting. None of these is completely self-explanatory but that is what TFM is for. To be nearly self-explanatory for we could use '"$[a-z]*$"-quoting' but then someone would have to post .wav files with the correct spelling :) double-dollar quoting feels good to say aloud ... -------- Hannu
Hannu Krosing <hannu@tm.ee> writes: > Tom Lane kirjutas P, 14.09.2003 kell 18:58: >> Those seem pretty unmemorable and content-free, though. Any other ideas >> out there? > Considering that we use $$ instead of quotes we could call it dollarring > instead of quoting ;) > ... > double-dollar quoting feels good to say aloud ... Hm, "dollar quoting" doesn't sound too bad. I could go with that, unless someone has a better idea? regards, tom lane
----- Original Message ----- From: "Tom Lane" <tgl@sss.pgh.pa.us> > Hannu Krosing <hannu@tm.ee> writes: > > Tom Lane kirjutas P, 14.09.2003 kell 18:58: > >> Those seem pretty unmemorable and content-free, though. Any other ideas > >> out there? > > > Considering that we use $$ instead of quotes we could call it dollarring > > instead of quoting ;) > > ... > > double-dollar quoting feels good to say aloud ... > > Hm, "dollar quoting" doesn't sound too bad. I could go with that, > unless someone has a better idea? > I would prefer a name that says what it does rather than what it looks like. But I can't think of a good one :-) 'Unescaped quoting' is more or less what it does, but while not content-free :-) is less than memorable. *shrug* I don't care that much that I want to spend lots of time thinking about it. cheers andrew
On Mon, 15 Sep 2003, Tom Lane wrote: > Hm, "dollar quoting" doesn't sound too bad. I could go with that, > unless someone has a better idea? I don't mind "dollar quoting". How about "block quoting"? That describes what it's used for, rather than what it looks like. Jon
On Sat, 13 Sep 2003, Miko O'Sullivan wrote: > jon@endpoint.com (Jon Jensen) wrote in message news:<Pine.LNX.4.58.0309111950010.7142@louche.swelter.net>... > > INSERT INTO sometable (5, <<\. > > a > > very long > > string > > \. > > ); > > > I'm delighted to hear that here docs are being discussed for postgres. > In the world of Perl here docs make life SO MUCH easier by allowing > for independent quoting scopes. Jon, I'm totally in agreement with you > that here docs would be a valuable addition to postgres. I'd like to > propose a few modifications on the syntax you suggest above. > > In Perl, the rule for here docs is NOT "it starts immediately after > the definition of the terminator" (as in your example). The rule is > "it starts on the first line after the end of the command in which the > terminator appears". There's a very good reason for this distinction. > That reason is that it allows the command to be succinct, usually a > single line, and the here doc stays completely separate. So, your > example would like this instead: > > INSERT INTO sometable (5, <<'(END)'); > a > very long > string > (END) > > Notice that the entire INSERT command is on one line. Very easy to > read. I'm aware of that distinction, and use it to advantage in Perl. It would make parsing harder, though. > One other small tweak. In Perl 5 the rule is that the terminator must > be flush to the left of the line. That means that here docs can mess > up otherwise nicely indented code. That issue is being fixed in Perl > 6, where the terminator can be indented with the rest of the code. That also would make parsing harder, and Perl 6 is not yet real, so it'd be deviating from historical shell and Perl here document syntax. In the end it looks like consensus has settled on a new syntax with some of the benefits of here documents without the requirement that the end token be on its own blank line (and thus necessitating end-of-line checks) and perhaps more importantly, not requiring that everything quoted in a here document always end with a newline. Jon
sendmail.cf Sean Chittenden wrote: >> >> The $$FOO proposal I put forward earlier was consciously modeled on >> >> here-documents. >> >> > Couldn't we allow << at the beginning of the line to mean 'here' document? >> >> No; you could easily be breaking existing queries, for example > > Let me jump in for half a second here (no pun intended), but what > about the use of back quotes? ` `? Use a very limited escaping policy > of \` => ` and \\ => \ . Back quotes aren't used in any SQL or > PL/PgSQL that I'm aware of. I know ruby, perl, and sh make use of > back ticks, but for all intents and purposes, pl/perl shouldn't use > back ticks, and if it is, for performance reasons a DBA shouldn't. > Use of open() or system() should be encouraged as it is possible to > avoid fork()/exec()'ing a shell, nevermind that ``'s aren't the wisest > inclusions for DBAs. If ``'s are needed in the pl language, they can > be nominally included with a \`\`, but given their relative rareness > compared to ' or ", I'd think the addition of ` would be welcome and > much less cumbersome/easier to remember than other options that have > popped up in this thread. > > $0.02 > > -sc > -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Can we add digits to the allowed character set of the marker? It'd make life easier for languages that check if the quoting marker actually occurs in the text to be quoted. Jan Tom Lane wrote: > Sean Chittenden <sean@chittenden.org> writes: >> Using $$[.*]\n as a lexical token is a quasi-problematic as the anchor >> is the newline, something that SQL has been free of for as long as I'm >> aware of. By using a static lexical token, such as @@, newline's >> aren't important, thus reducing the number of accidental syntax errors >> from programmers. While I abhor the "let's put a magic token in this >> context to handle this quirk" grammar design methodology that Perl has >> brought, I do think that a simple doubling up of a nearly unused >> operator would be sufficient, concise, and easy. > > No, it absolutely would not be. We could pick something that would not > create conflicts as a quote start marker, since we know what the SQL > grammar is. But it would be guaranteed to fail as an end marker. Let > me remind you that the goal here is to be able to quote any text > whatever, including nested uses of the same quoting mechanism. > > After sleeping on it, I do think that tying the mechanism to newlines > is just unnecessary complication. I'm currently leaning to an idea that > was suggested yesterday by (I think) Andreas: let the quote start marker > be a token of the form > dollarsign zero-or-more-letters dollarsign > and let the quote body extend to the next occurrence of the identical > string. For example > ... $Q$Joe's house$Q$ ... > is equivalent to > ... 'Joe''s house' ... > > This is extremely compact for quoting strings that don't contain any > doubled dollar signs, since you don't need any letters at all. I could > see $$text$$ becoming a very common way to quote material that contains > single quotes or backslashes. But since you can choose any string of > letters to make up the terminating token, the mechanism is able to quote > any text whatever, including nested occurrences of the same structure > (with a different letterstring of course). > > Note that there is no particular need to insist on any nearby newlines. > If the construct is written just following an identifier or keyword, > then you do need some intervening whitespace to keep the $Q$ from being > read as part of that identifier, but I doubt this will bother anyone. > > Note that I'm allowing only letters, not digits, in the string; this > avoids any possible ambiguity with $n parameter tokens. We have no > other SQL tokens that are allowed to start with $, so this creates no > other lexical ambiguity. > > Comments? > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Tom Lane wrote: > "Andrew Dunstan" <andrew@dunslane.net> writes: >> I think calling it 'here-document' quoting is possibly unwise - it is >> sufficiently different from here documents in shell and perl contexts to >> make it confusing. > > I agree. I've tried to think of a better alternative name, but without > much success. > >> We could call it meta-quoting, or alternative quoting, maybe. > > Those seem pretty unmemorable and content-free, though. Any other ideas > out there? textembed markedtext hereliteral markedliteral litter > > regards, tom lane > > PS: btw, I have realized that the seemingly obvious algorithm for > choosing a delimiter string for given text is wrong. I had imagined it > as "try $$, $Q$, $QQ$, $QQQ$, etc, until you find a delimiter not > present in the given text string". This is not right because, for > example, if the string ends with $Q then $Q$ is not a usable delimiter > ($Q$...$Q$Q$ would be misparsed). The simplest correct algorithm is > "try $, $Q, $QQ, $QQQ, etc until you find a string not present in the > given text string; then use that with a $ appended". > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
quote-less quoting :-) Robert Treat On Mon, 2003-09-15 at 02:38, Andrew Dunstan wrote: > > ----- Original Message ----- > From: "Tom Lane" <tgl@sss.pgh.pa.us> > > > > Hannu Krosing <hannu@tm.ee> writes: > > > Tom Lane kirjutas P, 14.09.2003 kell 18:58: > > >> Those seem pretty unmemorable and content-free, though. Any other > ideas > > >> out there? > > > > > Considering that we use $$ instead of quotes we could call it dollarring > > > instead of quoting ;) > > > ... > > > double-dollar quoting feels good to say aloud ... > > > > Hm, "dollar quoting" doesn't sound too bad. I could go with that, > > unless someone has a better idea? > > > > I would prefer a name that says what it does rather than what it looks like. > But I can't think of a good one :-) 'Unescaped quoting' is more or less what > it does, but while not content-free :-) is less than memorable. *shrug* I > don't care that much that I want to spend lots of time thinking about it. > > cheers > > andrew > > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend -- Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
Jon Jensen wrote: >On Sat, 13 Sep 2003, Miko O'Sullivan wrote: > > > >>jon@endpoint.com (Jon Jensen) wrote in message news:<Pine.LNX.4.58.0309111950010.7142@louche.swelter.net>... >> >> >>>INSERT INTO sometable (5, <<\. >>>a >>>very long >>>string >>>\. >>>); >>> >>> >>I'm delighted to hear that here docs are being discussed for postgres. >> In the world of Perl here docs make life SO MUCH easier by allowing >>for independent quoting scopes. Jon, I'm totally in agreement with you >>that here docs would be a valuable addition to postgres. I'd like to >>propose a few modifications on the syntax you suggest above. >> >>In Perl, the rule for here docs is NOT "it starts immediately after >>the definition of the terminator" (as in your example). The rule is >>"it starts on the first line after the end of the command in which the >>terminator appears". There's a very good reason for this distinction. >> That reason is that it allows the command to be succinct, usually a >>single line, and the here doc stays completely separate. So, your >>example would like this instead: >> >> INSERT INTO sometable (5, <<'(END)'); >> a >> very long >> string >> (END) >> >>Notice that the entire INSERT command is on one line. Very easy to >>read. >> Well we might also have something like CREATE FUNCTION ... AS QUOTE('MyEndMarker') .... all the stuff ... MyEndMarker LANGUAGE 'plpgsql'; This looks quite SQL-like and should be easily implementable. Anyway, this Perl-like style or Tom's proof-of-concept dollar-quoting one, both seem equally usable, now let's have someone roll a die to decide. Regards, Andreas
Jon Jensen <jon@endpoint.com> writes: > On Sat, 13 Sep 2003, Miko O'Sullivan wrote: >> In Perl, the rule for here docs is NOT "it starts immediately after >> the definition of the terminator" (as in your example). The rule is >> "it starts on the first line after the end of the command in which the >> terminator appears". There's a very good reason for this distinction. >> That reason is that it allows the command to be succinct, usually a >> single line, and the here doc stays completely separate. > I'm aware of that distinction, and use it to advantage in Perl. It would > make parsing harder, though. Both Perl and shell are designed around the idea that most commands should fit on one line, so it makes sense for them to do things that way. I think SQL commands run to more than one line as often as not, so I'm not really enamored of keeping the next-line behavior of here documents for our purposes. Think of CREATE FUNCTION foo(int, text, ...) RETURNS text AS $BODY$LANGUAGE plpgsql IMMUTABLE RETURNS NULL ON NULL INPUT;... functiontext ...$BODY$ This is not lexing any more, it is some kind of weird action-at-a-distance interpretation of what you type. To understand it you have to match up the occurrence of $BODY$ with the later occurrence of a semicolon. And what happens if you don't want to type a semicolon? (For instance, in psql you might be wanting to use \g instead.) The other alternative is "it starts on the next physical line": CREATE FUNCTION foo(int, text, ...) RETURNS text AS $BODY$... function text ...$BODY$LANGUAGE plpgsql IMMUTABLE RETURNS NULLON NULL INPUT; which is probably more workable from an implementation point of view, but it really doesn't seem to offer any notational advantage over just letting the quoted text start where the quote marker is. A more principled argument is that SQL syntax is generally not dependent on newlines and we shouldn't make it so. The original here-document design made sense in the context of shell scripts, wherein newlines definitely do have syntactic meaning. The fact that Perl picked it up without changing it doesn't impress me a lot --- Larry Wall's taste in lexical design is, um, eclectic, but not necessarily something to emulate. > In the end it looks like consensus has settled on a new syntax with some > of the benefits of here documents without the requirement that the end > token be on its own blank line (and thus necessitating end-of-line checks) > and perhaps more importantly, not requiring that everything quoted in a > here document always end with a newline. Yeah, one thing I like about the dollar-quoting proposal is you can easily control whether there are leading or trailing newlines in the string. This is not very important for function definitions, but it makes the feature a lot more useful for other quoted strings. regards, tom lane
On Mon, 15 Sep 2003, Tom Lane wrote: > >> In Perl, the rule for here docs is NOT "it starts immediately after > >> the definition of the terminator" (as in your example). The rule is > >> "it starts on the first line after the end of the command in which the > >> terminator appears". There's a very good reason for this distinction. > >> That reason is that it allows the command to be succinct, usually a > >> single line, and the here doc stays completely separate. > > > I'm aware of that distinction, and use it to advantage in Perl. It would > > make parsing harder, though. > > Both Perl and shell are designed around the idea that most commands > should fit on one line, so it makes sense for them to do things that > way. Oh, actually, I misread his statement, which is not correct. In Perl, at least, the here document does not start after the current statement ends. It starts on the line right after the one that <<TOKEN appears on, regardless whether the statement is complete or not, just like you described next: > The other alternative is "it starts on the next physical line": > > CREATE FUNCTION foo(int, text, ...) RETURNS text AS $BODY$ > ... function text ... > $BODY$ > LANGUAGE plpgsql IMMUTABLE RETURNS NULL ON NULL INPUT; > > which is probably more workable from an implementation point of view, > but it really doesn't seem to offer any notational advantage over just > letting the quoted text start where the quote marker is. So that's why I leaned strongly toward here documents in the first place. I think they're a pretty good match. But ... > A more principled argument is that SQL syntax is generally not dependent > on newlines and we shouldn't make it so. I've been won over to this argument. > The original here-document > design made sense in the context of shell scripts, wherein newlines > definitely do have syntactic meaning. The fact that Perl picked it up > without changing it doesn't impress me a lot --- Larry Wall's taste in > lexical design is, um, eclectic, but not necessarily something to emulate. I love Perl's lexical design, though I recognize not everyone shares my opinion. :) In any case, SQL is a lot different than Perl. And controlling whether there's a final newline in the quoted string is much more important in a stream of SQL statements than in a programming language where you can munge the newline off the end if you really need to. Jon
Jan Wieck <JanWieck@Yahoo.com> writes: > Can we add digits to the allowed character set of the marker? We can't allow a digit right after the opening dollar sign; that would look like a parameter. I suppose we could allow $identifier$ though. That might be the easiest compromise. regards, tom lane
Tom Lane wrote: >Jan Wieck <JanWieck@Yahoo.com> writes: > > >>Can we add digits to the allowed character set of the marker? >> >> > >We can't allow a digit right after the opening dollar sign; that would >look like a parameter. > >I suppose we could allow $identifier$ though. That might be the easiest >compromise. > > > Excellent! cheers andrew
> > I think calling it 'here-document' quoting is possibly unwise - it is > > sufficiently different from here documents in shell and perl contexts to > > make it confusing. > > I agree. I've tried to think of a better alternative name, but without > much success. > > > We could call it meta-quoting, or alternative quoting, maybe. > > Those seem pretty unmemorable and content-free, though. Any other ideas > out there? literal text blocks literal quotes literal text sections -sc -- Sean Chittenden
Andreas said >> Well we might also have something like CREATE FUNCTION ... AS QUOTE('MyEndMarker') .... all the stuff ... MyEndMarker LANGUAGE 'plpgsql'; This looks quite SQL-like and should be easily implementable. Anyway, this Perl-like style or Tom's proof-of-concept dollar-quoting one, both seem equally usable, now let's have someone roll a die to decide. >> But if THIS solution is implemented then QUOTE('MyEndMarker') and MyEndMarker become noise words and it seems the parser could just assume to handle everything between CREATE FUNCTION ... AS and LANGUAGE as the delimiters for a function definition. But as that's so simple I must be missing something. Rick
Richard Hall wrote: >But if THIS solution is implemented then >QUOTE('MyEndMarker') > and >MyEndMarker > become noise words and it seems the parser could just assume to handle everything between >CREATE FUNCTION ... AS > and >LANGUAGE > as the delimiters for a function definition. > >But as that's so simple I must be missing something. > > > I was not aware of this either until Tom pointed it out: There syntax allows the LANGUAGE clause to precede the AS clause. What is more, a quoting mechaninsm is needed in other contexts, although this is the most significant one. From what I can see Tom's suggestion has the concensus support. cheers andrew
"Richard Hall" <rhall@micropat.com> writes: > But if THIS solution is implemented then > QUOTE('MyEndMarker') > and > MyEndMarker > become noise words and it seems the parser could just assume to handle everything between > CREATE FUNCTION ... AS > and > LANGUAGE > as the delimiters for a function definition. > But as that's so simple I must be missing something. Well, for one thing, that approach means that LANGUAGE is the *only* terminating delimiter, which loses most of the benefit of the here-document-inspired approach --- you can't quote material that includes the word LANGUAGE, which among other things keeps you from nesting quoted blocks. Also, this approach is useless for quoting material in any context except CREATE FUNCTION. If we are going to have a special lexer mechanism, it'd be good to make it useful for creating quoted strings in all contexts. regards, tom lane