Re: plpython function problem workaround - Mailing list pgsql-general

From Michael Fuhr
Subject Re: plpython function problem workaround
Date
Msg-id 20050316181321.GA35801@winnie.fuhr.org
In response to Re: plpython function problem workaround  (Michael Fuhr <mike@fuhr.org>)
Responses Re: plpython function problem workaround  (Marco Colombo <pgsql@esiway.net>)
List pgsql-general
[I've changed the Subject back to the thread that started this
discussion.]

On Wed, Mar 16, 2005 at 05:52:02PM +0100, Marco Colombo wrote:

> I'm against to any on-the-fly conversion, now.
> I don't like the idea of PostgreSQL accepting input in one form
> (\r\n) and providing output in a different form (\n). Also think of
> a function definition with mixed \r\n and \n lines: we'd have no way
> to reconstruct the original input.

Yeah, that's a reasonable argument against modifying the function
source code before storing it in pg_proc.  But I expect this problem
will come up again, and some people might not care about being able
to reconstruct the original input if it's just a matter of stripped
carriage returns, especially if the function logic doesn't use
literal carriage return characters that would be missed.  For those
people, the validator hack might be an acceptable way to deal with
a client interface that inserts carriage returns that the programmer
didn't intend anyway.  Not necessarily as part of the core PostgreSQL
code or even distributed with PostgreSQL, but as something they
could install if they wanted to.
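
To make both sides of that tradeoff concrete, here is a minimal sketch in plain Python of the kind of normalization such a hack would apply. The name normalize_newlines and the sample strings are made up for illustration; this is not the validator itself, and it says nothing about how the result would be written back to pg_proc.

```python
# Hypothetical helper, not part of PostgreSQL or PL/Python: it only
# illustrates converting CRLF line endings to LF in a function body.
def normalize_newlines(src: str) -> str:
    """Replace every CRLF pair with a single LF."""
    return src.replace("\r\n", "\n")


# Two different inputs: one uses CRLF throughout, one mixes CRLF and LF.
crlf_source = "x = 1\r\nreturn x\r\n"
mixed_source = "x = 1\r\nreturn x\n"

normalized = normalize_newlines(crlf_source)
print(repr(normalized))

# Both inputs collapse to the same text, so the original form
# cannot be reconstructed from the normalized one.
print(normalize_newlines(mixed_source) == normalized)
```

The second print is Marco's objection in miniature: once the carriage returns are gone, nothing records which lines had them.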

> I think we should just state that text used for function definitions
> is \n-delimited.  Some languages may accept \r\n as well, but that's
> undocumented side effect, and bad practice.

Whether it's an "undocumented side effect" depends on the language,
and whether it's bad practice is a matter of opinion.  In any case,
that's the language's concern and not something PostgreSQL should
judge or enforce.  PostgreSQL shouldn't have to know or care about a
procedural language's syntax -- a function's source code should be an
opaque object that PostgreSQL stores and passes to the language's
handler without caring about its contents.  Syntax enforcement should
be in the language's validator or handler according to the language's
own rules.

Speaking of code munging and syntax enforcement, have a look at this:

CREATE FUNCTION foo() RETURNS text AS $$
return """line 1
line 2
line 3
"""
$$ LANGUAGE plpythonu;

SELECT foo();
           foo
--------------------------
 line 1
        line 2
        line 3

(1 row)

Eh?  Where'd those leading tabs come from?  Why, they came from
PLy_procedure_munge_source() in src/pl/plpython/plpython.c:

    mrc = PLy_malloc(mlen);
    plen = snprintf(mrc, mlen, "def %s():\n\t", name);
    Assert(plen >= 0 && plen < mlen);

    sp = src;
    mp = mrc + plen;

    while (*sp != '\0')
    {
        if (*sp == '\n')
        {
            *mp++ = *sp++;
            *mp++ = '\t';
        }
        else
            *mp++ = *sp++;
    }
    *mp++ = '\n';
    *mp++ = '\n';
    *mp = '\0';

How about them apples?  The PL/Python handler is already doing some
fixup behind the scenes (and potentially causing problems, as the
example illustrates).
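
For anyone who wants to see the effect without a PostgreSQL installation, here is a rough re-creation of that loop in plain Python. It is only a sketch: munge_source is my own name for it, and I'm assuming the function body is everything between the $$ delimiters, including the newlines right after the opening $$ and right before the closing one.

```python
# Sketch of what the C loop above does, rewritten in Python for
# illustration; this is not the handler's actual code.
def munge_source(name: str, src: str) -> str:
    """Wrap src in a def and put a tab after every newline."""
    return "def %s():\n\t" % name + src.replace("\n", "\n\t") + "\n\n"


# Assumed body of foo(): the text between the $$ delimiters.
src = '\nreturn """line 1\nline 2\nline 3\n"""\n'

munged = munge_source("foo", src)

# Compile and call the wrapped function the way any Python code would run.
namespace = {}
exec(munged, namespace)
result = namespace["foo"]()

print(repr(result))  # 'line 1\n\tline 2\n\tline 3\n\t'
```

The tab is added after every newline regardless of whether that newline sits inside a string literal, which is why the tabs end up in the returned value.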

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
