Re: Multiline plpython procedure - Mailing list pgsql-general

From Marco Colombo
Subject Re: Multiline plpython procedure
Msg-id Pine.LNX.4.61.0501191120290.17420@Megathlon.ESI
In response to Re: Multiline plpython procedure  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Multiline plpython procedure  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
On Tue, 18 Jan 2005, Tom Lane wrote:

> Michael Fuhr <mike@fuhr.org> writes:
>> http://docs.python.org/ref/physical.html
>
>> "A physical line ends in whatever the current platform's convention
>> is for terminating lines.  On Unix, this is the ASCII LF (linefeed)
>> character.  On Windows, it is the ASCII sequence CR LF (return
>> followed by linefeed).  On Macintosh, it is the ASCII CR (return)
>> character."
>
> Seems like Guido has missed a bet here: namely the case of a script
> generated on one platform and fed to an interpreter running on another.

I think you're missing that vendors define what a 'text file' is on their
platform, not Guido. Guido just says that a Python program is a text file,
which is a very sound decision, since it makes perfect sense to be able
to edit it with native tools (text editors which do not support alien
textfile formats).
What you seem to be missing is that before scripts are "fed to interpreter
running on another [platform]" they need to be transferred there!
Conversion must happen (if necessary) at that point. That's why the
2000-year-old FTP protocol (well, maybe not 2000 years, but it _is_ old)
has an ASCII transfer mode. Is this situation unfortunate? Yes. Every
(programming) language is (or should be) affected by the same problem,
since I expect the source file to be a _text_ file, everywhere.
A \n line-terminated file is not a text file under Windows, per specs.
A \r\n line-terminated file is not a text file under Unix, per specs.
A \r line-terminated file is not a text file under either Windows or
Unix, per specs.
(I'm not sure what the specs are under Mac).
Those are facts of life you have to deal with every time you move a text
file (such as the source of a program) from platform X to platform Y.
It may affect a Cobol or Lisp or C compiler as well.
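
As an illustration of the conversion step described above (what an ASCII-mode
transfer does, roughly), here is a minimal sketch. The function name
convert_line_endings is hypothetical and not taken from any tool mentioned in
this thread; it rewrites any of the three terminators to the convention of the
platform it runs on.

```python
import os
import re

# Matches any of the three conventions discussed: CR LF, lone CR, lone LF.
# CR LF is listed first so that it is consumed as one terminator, not two.
_ANY_TERMINATOR = re.compile(r"\r\n|\r|\n")


def convert_line_endings(text, target=os.linesep):
    """Rewrite every line terminator in `text` to `target`.

    `target` defaults to the convention of the platform running this code.
    This is a hypothetical helper for illustration only.
    """
    # A function is used as the replacement so that `target` is inserted
    # literally, without any escape processing by the re module.
    return _ANY_TERMINATOR.sub(lambda match: target, text)
```

Calling it with an explicit target, for example
convert_line_endings(source, "\n"), gives the Unix form regardless of where
the code runs.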

> If I were designing it, I would say that any Python interpreter should
> take all three variants no matter which platform the interpreter itself
> is sitting on.  Or is cross-platform support not a Python goal?

Changing what a text file is under all platforms Python aims to run on
is not. (I can't speak for Guido of course, but I'm pretty sure it isn't).
I'm not against your suggestion, but that won't help with the simple fact
that text files need to be converted to what the platform they sit on
defines a text file to be. Otherwise, many other native tools fail to
treat them as text files.

> In short, any bug report on this ought to go to the Python project.

Definitely not a bug report for Python. It seems to me it works as expected
(that is, as documented). The bug is in the application that transferred
the text file over the wire from platform X to platform Y.

OTOH, it's true that on the client-server "wire" no text file is transferred,
strictly speaking. Just a string which happens to be a valid Python
program. Moreover, Python is used more like an embedded scripting language
(not as a standalone programming language). So you're right when you expect
it to be more tolerant.

This is a grey area. It is pretty clear that a text file is a sequence
of lines: the separator is platform specific but the user/application
becomes aware of it only when the text file is accessed in binary mode
(with some quirks... most native Unix tools will do precisely that,
think of md5sum, since there's no way to recognise a "text file").
It happens that when a text file is _correctly_ accessed, the platform
hides the separator (or should do).
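
The difference between the two access modes can be shown with a small sketch.
It assumes present-day Python 3, which postdates this message, and the helper
name read_both_ways is made up for the example. Note that Python 3's default
text mode goes further than the platform convention: it translates all three
terminators to "\n" when reading, on any platform.

```python
import os
import tempfile


def read_both_ways(raw_bytes):
    """Write raw_bytes to a temporary file, then read it back twice:
    once in binary mode and once in text mode.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw_bytes)
        # Binary mode: the separator bytes are visible exactly as stored.
        with open(path, "rb") as f:
            as_binary = f.read()
        # Text mode with newline=None (the default): "\r\n", "\r" and "\n"
        # are all translated to "\n", so the caller never sees the separator.
        with open(path, "r", encoding="ascii", newline=None) as f:
            as_text = f.read()
    finally:
        os.remove(path)
    return as_binary, as_text
```

A tool like md5sum sees what the binary read sees, which is why two "equal"
text files with different terminators produce different checksums.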

For "multiline strings" (which is the right data type for a Python
embedded script), everything is just worse (there's nothing even close
to a vague definition). IMHO, every time such a string is handed to a
native tool, it should be converted to the platform specific multiline
format (that is, with the right separator). You shouldn't expect the
external tool to be able to cope with alien line formats.
Alternatively, you should _define_ what the separator is for Python
scripts embedded in PostgreSQL, and have the interpreter accept it on
every platform unconditionally (I'm not sure whether this is easy or not).
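
To make the first option concrete, here is a rough sketch of converting a
multiline string before handing it to the interpreter. It is written in
Python for brevity and is not how PL/Python does (or would do) it; the name
compile_embedded_script and the choice of "\n" as the single defined
separator are assumptions made for the example.

```python
def compile_embedded_script(source, name="<embedded>"):
    """Normalize line terminators in `source` to "\\n", then compile it.

    Hypothetical helper: it assumes "\\n" has been chosen as the one
    separator the interpreter is required to accept.
    """
    # Replace CR LF first so that it becomes one "\n" rather than two,
    # then turn any remaining lone CR into "\n".
    normalized = source.replace("\r\n", "\n").replace("\r", "\n")
    return compile(normalized, name, "exec")
```

The caller would then run the resulting code object, for instance with
exec(code, namespace), without caring which platform the string was typed on.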

Just my 0.03 eurocents.
.TM.
--
       ____/  ____/   /
      /      /       /            Marco Colombo
     ___/  ___  /   /              Technical Manager
    /          /   /             ESI s.r.l.
  _____/ _____/  _/               Colombo@ESI.it
