Re: Multiline plpython procedure - Mailing list pgsql-general

From Marco Colombo
Subject Re: Multiline plpython procedure
Date
Msg-id Pine.LNX.4.61.0501211123130.4205@Megathlon.ESI
In response to Re: Multiline plpython procedure  (Greg Stark <gsstark@mit.edu>)
Responses Re: Multiline plpython procedure
Re: Multiline plpython procedure
List pgsql-general
On Fri, 21 Jan 2005, Greg Stark wrote:

>
> Marco Colombo <marco@esi.it> writes:
>
>> Exactly. Or, one could say: the "standard" text format is the one the
>> platform you are running on dictates. Which is what python does.
>
> Egads. So the set of valid Python programs is different depending on what
> platform you're on? That's just, uhm, insane. So essentially Python isn't a
> single language, it's a set of languages, Python-NL, Python-NLCR, Python-CR,
> (and in theory others).

No. Just like any other application that reads text files, it reads text files.
That simple. It's unfortunate that 'textfile' means different things
on different platforms.

> So if I generate a database with a Python-CRNL function on windows, then
> pg_dump it and load it on Unix the function won't run because it's the wrong
> language, Unix only supports Python-NL.
>
> I don't think it's reasonable for pg_dump to think about converting data from
> one language to another. It's important for pg_dump to restore an identical
> database. Having it start with special case data conversion from one flavour
> to another seems too dangerous.

Makes no sense. pg_dump already makes a lot of conversions: from internal
representation (which may be platform dependent) to some common format,
say text. It's just multi-line text which is hard to deal with, because
there is _no_ single format for it. pg_dump may just choose one format, and
stick with it. Every dump/restore will work. You may have trouble editing
a text dump, but that's another matter. BTW, what does pg_dump do on Windows?
I mean with -F p. Does it produce a text file with CRNL line separators?
What happens if you feed that file to psql on a Unix box?
I've tried (adding spurious CRs) on Unix, and I think SQL treats CR as
whitespace so it's no issue. But what about the opposite? Is psql on Windows
able to recognize SQL scripts made on Unix? (I can't try this).

Anyway, think of floats. If you want to do FP maths fast, you need to use
the native format supported by the CPU. When you dump, you get a text
form of the FP number, and when you restore on a different platform you
may get a _different_ number. And you have to live with it. Kiss goodbye
to your "identical database".
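
To make the float point concrete, here is a minimal Python sketch of a
text round trip. It assumes a dump that keeps 15 significant digits per
value; that digit count and the helper names dump_float/restore_float are
assumptions for illustration, not a description of what pg_dump emits.

```python
# Hypothetical text dump/restore of a double-precision value.
def dump_float(x, digits=15):
    # Render the float as text with a fixed number of significant digits.
    return f"{x:.{digits}g}"


def restore_float(text):
    # Parse the text form back into the native floating-point format.
    return float(text)


original = 0.1 + 0.2

# With 15 significant digits the text form drops the low-order bits.
restored = restore_float(dump_float(original))

# 17 significant digits are enough to round-trip an IEEE 754 double.
lossless = restore_float(dump_float(original, digits=17))

print(original == restored)
print(original == lossless)
```

The restored value differs only when the text form carries fewer digits
than the native format needs, which is the trade-off a text dump makes.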

> Incidentally, are we sure we've diagnosed this correctly? I'm discussing this
> with some Python developers and they're expressing skepticism. One just tried
> a quick test with a Python program containing a mixture of all three newline
> flavours and it ran fine.

Recent python has universal newline support. It works for files, and it's
enabled by default when it reads source files. But it's NOT part of the parser,
AFAIK, and the source file gets converted to Unix format before being
fed to the parser (lexer). Problem is that I'm not sure that's the way
python is used by PostgreSQL. It works only when the program is read from
a file. That's what the guy tested, probably. If you build a program, put
it in a string, and invoke the parser, the string must be in Unix format.
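
A small sketch of the file-versus-string distinction, using the open()
call of current Python rather than the interpreter of the day: the
translation happens in the file layer, so text that never passes through
it keeps whatever line endings it was built with. The sample source and
the variable names are made up for the example.

```python
import os
import tempfile

# Source text that mixes all three line-ending flavours.
raw = b"x = 1\r\ny = 2\rz = 3\n"

with tempfile.NamedTemporaryFile(delete=False, suffix=".py") as f:
    f.write(raw)
    path = f.name

try:
    # Default text mode (newline=None): universal newlines are on, so
    # "\r\n" and "\r" are translated to "\n" while reading.
    with open(path, "r", encoding="ascii") as f:
        from_file = f.read()

    # newline="" turns the translation off; the text comes back untouched.
    with open(path, "r", encoding="ascii", newline="") as f:
        untranslated = f.read()
finally:
    os.remove(path)

print(repr(from_file))
print(repr(untranslated))
```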

I'm for defining a format used by PostgreSQL, and forcing the python parser
into accepting it on all platforms. That is, let's set the rule that
python programs to be embedded into PostgreSQL use \n as line termination.
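
One way such a rule could be enforced is to normalize the function body
before it reaches the parser. The sketch below is only an illustration:
normalize_newlines and compile_function_body are hypothetical helpers,
not anything PL/Python is known to contain.

```python
def normalize_newlines(source):
    # Replace "\r\n" first so a CR LF pair does not become two newlines,
    # then convert any remaining lone "\r".
    return source.replace("\r\n", "\n").replace("\r", "\n")


def compile_function_body(source, name="<plpython>"):
    # Hand the parser text that uses "\n" only, whatever the client sent.
    return compile(normalize_newlines(source), name, "exec")


# A body as it might arrive from clients on different platforms.
body = "a = 1\r\nb = a + 1\rc = b + 1\n"

namespace = {}
exec(compile_function_body(body), namespace)
print(namespace["c"])
```

Note that a blunt replace like this also rewrites any CR that sits inside
a string literal, so it trades exact preservation of the text for a
single line format.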

Think of this: tomorrow we meet people from Mars. One of them really likes
PostgreSQL, and ports it to their platform. Being a martian platform, it
uses a different text file format. Line separator there is the first 1000
binary digits of PI. When he writes a small python function on his client
and tries to have it run on a server on Earth, it fails, because the python
parser here won't handle PI-terminated lines correctly. What would you do?
Bug python developers because "python is a set of languages, Python-Earth,
Python-Mars, Python-Venus (and in theory others)"? (BTW, in that situation,
I bet Perl would fail as well). Or would you ask the martian guy to add,
as part of his porting effort, support for the martian line format to PostgreSQL,
so that the server can convert the python program to Earth format before
feeding it to python? Or, alternatively, just tell him: python programs
in PostgreSQL are \n terminated? Which one is the simplest?

.TM.
--
       ____/  ____/   /
      /      /       /            Marco Colombo
     ___/  ___  /   /              Technical Manager
    /          /   /             ESI s.r.l.
  _____/ _____/  _/               Colombo@ESI.it
