Thread: newline conversion in SQL command strings

newline conversion in SQL command strings

From
Peter Eisentraut
Date:
I have received a number of bug reports about plsh choking on
Windows-style line endings.  The problem is that the user uses some
Windows-based tool or other to execute an SQL command line this:

CREATE FUNCTION foo() RETURNS something<CR><LF>
LANGUAGE plsh<CR><LF>
AS $$<CR><LF>
#!/bin/sh<CR><LF>
<CR><LF>
do something<CR><LF>
do something<CR><LF>
$$;<CR><LF>

which (apparently, I don't have Windows handy) creates a function with
the prosrc body of

'<CR><LF>
#!/bin/sh<CR><LF>
<CR><LF>
do something<CR><LF>
do something<CR><LF>
'

But executing this fails because Unix shells reject <CR> characters in
inappropriate places as syntax errors.

I don't know how to handle that.  It would be unfortunate to have the
behavior of a function depend on the kind of client used to create or
modify it.

I suppose the other more mainstream PLs don't expose that problem so
much because the interpreters can handle either kind of line ending.
But there could still be functional differences if for example a line
ending is embedded into a string constant.

Ideas?





Re: newline conversion in SQL command strings

From
Heikki Linnakangas
Date:
On 20.09.2012 05:56, Peter Eisentraut wrote:
> I have received a number of bug reports about plsh choking on
> Windows-style line endings.  The problem is that the user uses some
> Windows-based tool or other to execute an SQL command line this:
>
> CREATE FUNCTION foo() RETURNS something<CR><LF>
> LANGUAGE plsh<CR><LF>
> AS $$<CR><LF>
> #!/bin/sh<CR><LF>
> <CR><LF>
> do something<CR><LF>
> do something<CR><LF>
> $$;<CR><LF>
>
> which (apparently, I don't have Windows handy) creates a function with
> the prosrc body of
>
> '<CR><LF>
> #!/bin/sh<CR><LF>
> <CR><LF>
> do something<CR><LF>
> do something<CR><LF>
> '
>
> But executing this fails because Unix shells reject<CR>  characters in
> inappropriate places as syntax errors.
>
> I don't know how to handle that.  It would be unfortunate to have the
> behavior of a function depend on the kind of client used to create or
> modify it.

Could you strip the CRs? Either at CREATE FUNCTION time, or when the 
function is executed.

- Heikki



Re: newline conversion in SQL command strings

From
Peter Eisentraut
Date:
On 9/20/12 2:01 AM, Heikki Linnakangas wrote:
> Could you strip the CRs? Either at CREATE FUNCTION time, or when the
> function is executed.

It has been proposed that the plsh handler should strip the CRs before
execution.  But I don't think that is a correct solution, because that
is user data which could be relevant.  It could be the case, for
example, that plsh is run on a Windows host with a particular shell that
requires CRLF line endings.



Re: newline conversion in SQL command strings

From
Andrew Dunstan
Date:
On 09/20/2012 09:12 AM, Peter Eisentraut wrote:
> On 9/20/12 2:01 AM, Heikki Linnakangas wrote:
>> Could you strip the CRs? Either at CREATE FUNCTION time, or when the
>> function is executed.
> It has been proposed that the plsh handler should strip the CRs before
> execution.  But I don't think that is a correct solution, because that
> is user data which could be relevant.  It could be the case, for
> example, that plsh is run on a Windows host with a particular shell that
> requires CRLF line endings.
>
>

I confess I find it hard to take plsh terribly seriously, it just seems 
like a bad solution for practically any problem, when compared with, 
say, plperlu, and it seems so Unixish that the very idea of using it at 
all on Windows strikes me as crazy. Having said that, and devoted all of 
5 minutes to looking at the code, I suggest that you make two changes:
       if (sourcecode[0] == '\n')           sourcecode++;

would become:
       if (sourcecode[0] == '\n' || (sourcecode[0] == '\r' && sourcecode[1] == '\n'))           sourcecode++;

and
       len = strcspn(rest, "\n");

would become:
       len = strcspn(rest, "\r\n");



Also, there is probably a pretty good case for opening the temp file using PG_BINARY_W rather than "w" so that Windows
doesn'tproduce extra CRs for you (see recent discussion of this w.r.t. pg_upgrade).
 


cheers

andrew





Re: newline conversion in SQL command strings

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 09/20/2012 09:12 AM, Peter Eisentraut wrote:
>> It has been proposed that the plsh handler should strip the CRs before
>> execution.  But I don't think that is a correct solution, because that
>> is user data which could be relevant.  It could be the case, for
>> example, that plsh is run on a Windows host with a particular shell that
>> requires CRLF line endings.

> I confess I find it hard to take plsh terribly seriously,

Perhaps, but the general problem remains, for any language whose
interpreter is unforgiving about line endings.

However, I think the problem is insoluble as Peter has stated it.
His original complaint was 

> ... It would be unfortunate to have the
> behavior of a function depend on the kind of client used to create or
> modify it.

but I don't see a way to reconcile that with the notion that CRs might
be user data.  Either you think they're significant, in which case you
should retain the function body as presented, or you think they aren't,
in which case you might as well strip them.

I lean to the position that you're best off stripping them, given that
the lines are to be fed to a Unix shell.  The argument about some other
shell possibly requiring them seems pretty tenuous.  (IOW, I think that
the PL-specific code is exactly the place to be making such decisions.
I don't want Postgres' generic function support making any decisions
about whether CRs in function bodies are significant, because that
decision could be language-specific.)
        regards, tom lane



Re: newline conversion in SQL command strings

From
Andrew Dunstan
Date:
On 09/20/2012 03:34 PM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 09/20/2012 09:12 AM, Peter Eisentraut wrote:
>>> It has been proposed that the plsh handler should strip the CRs before
>>> execution.  But I don't think that is a correct solution, because that
>>> is user data which could be relevant.  It could be the case, for
>>> example, that plsh is run on a Windows host with a particular shell that
>>> requires CRLF line endings.
>> I confess I find it hard to take plsh terribly seriously,
> Perhaps, but the general problem remains, for any language whose
> interpreter is unforgiving about line endings.


plsh is the only one I know of that writes the script out to a temp 
file, and it opens that file on Windows in text mode (if I'm looking at 
the right source), which will convert LF into CRLF. Hence my suggestion 
that he change that. Of course, if the function body actually has 
embedded CRs then there is a different set of problems.

cheers

andrew