Re: Multiline plpython procedure - Mailing list pgsql-general

From Marco Colombo
Subject Re: Multiline plpython procedure
Msg-id Pine.LNX.4.61.0501211807080.4205@Megathlon.ESI
In response to Re: Multiline plpython procedure  (Bruno Wolff III <bruno@wolff.to>)
List pgsql-general
On Fri, 21 Jan 2005, Bruno Wolff III wrote:

> On Fri, Jan 21, 2005 at 12:02:09 +0100,
> If you are going to another system that uses the same floating point
> representation, you should get the same number. pg_dump writes out
> enough digits that the exact number can be recovered when the dump
> has been reloaded. This has been the case since 7.3.
>
> If you move the data to a machine with a different floating point
> representation you might get a different number even if the original number
> could be represented exactly in the new representation.

So the same pg_dump file _may_ already lead to different databases on
different platforms, even right now. The objection that databases must
be 'identical' is therefore not a serious one.
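
As a small illustration of the quoted point about writing out enough
digits, here is a sketch in Python. It assumes IEEE 754 double precision
floats and says nothing about how pg_dump itself formats numbers: it only
shows that 17 significant digits recover a double exactly on a system
with the same representation, while 15 are not always enough.

```python
# Sketch: round-tripping a double through its text form.
# Assumption: IEEE 754 binary64 floats, as used by CPython on common platforms.
x = 0.1 + 0.2

short_text = format(x, ".15g")   # 15 significant digits
full_text = format(x, ".17g")    # 17 significant digits

# Too few digits: reloading the text gives a different double.
lossy_roundtrip = float(short_text) == x

# Enough digits: reloading the text gives back exactly the same double.
exact_roundtrip = float(full_text) == x

print(short_text, lossy_roundtrip)
print(full_text, exact_roundtrip)
```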

Note that the float case is worse than the multiline text one. With multiline
text there is always a way to convert it without loss or change of information.
All you need is to treat it as a "sequence of lines".
Sequences of lines are represented differently on different platforms,
just like floats. Whenever you use one on a platform, use the
platform-dependent format. When you export it, use a defined external format.
On import, convert the defined external format to the internal one.
Does this lead to different internal formats on different platforms?
Yes, but what's the problem?
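
The export/import scheme described above could be sketched like this in
Python. The names to_external, to_internal and EXTERNAL_EOL are made up
for the example, and picking "\n" as the external format is just an
assumption; nothing here is existing PostgreSQL or client library
behaviour.

```python
import os

EXTERNAL_EOL = "\n"  # assumption: the agreed external line separator


def to_external(text):
    """Export: convert the common line-ending conventions to the external one."""
    # Replace "\r\n" first so it does not become two separators.
    return text.replace("\r\n", EXTERNAL_EOL).replace("\r", EXTERNAL_EOL)


def to_internal(text, eol=os.linesep):
    """Import: convert external-format text to the given platform convention."""
    return to_external(text).replace(EXTERNAL_EOL, eol)


exported = "Hello\nWorld\n"
on_windows = to_internal(exported, eol="\r\n")   # internal form on one platform
back_again = to_external(on_windows)             # exported again, unchanged
```

The internal form differs per platform, but exporting it again gives back
the same external text, so no information is lost on the way.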

The problem originated from a Windows application storing a multiline
text (a python function body) and this text being handed to a unix
program that expects a multiline text as input (the python interpreter).
This is only one particular case. _Any_ Windows client that inserts
a multiline text is likely to use \r\n as separator, while any unix
client is likely to insert text with \n. For the same input (same
sequence of lines typed by the user), the result is different.
There's no way to write a server-side application that handles that
correctly, right now. Of course 'Hello\r\nWorld\r\n' is different
from 'Hello\nWorld\n', as far as the server is concerned. But if
you think of what the users typed, you realize they should be equal.
It _is_ the same data (line 1 is 'Hello' and line 2 is 'World'), just
in different formats. Either the client library should handle that
transparently (converting to an on-the-wire format), or the server should
be aware of what convention the client is using.
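
To make the 'Hello'/'World' example concrete, here is a minimal Python
sketch. The helper as_lines is hypothetical and only illustrates the
idea of comparing texts as sequences of lines instead of as opaque
strings.

```python
import re


def as_lines(text):
    # Split on CRLF, lone CR, or lone LF, whichever convention the client used.
    return re.split(r"\r\n|\r|\n", text)


from_windows = "Hello\r\nWorld\r\n"
from_unix = "Hello\nWorld\n"

# What the server compares today: two different strings.
same_bytes = from_windows == from_unix

# What the users typed: the same two lines.
same_lines = as_lines(from_windows) == as_lines(from_unix)

print(same_bytes, same_lines)
```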

Right now the application developer has to take care of it, since
PostgreSQL (including the client library) treats text as opaque binary data.

(I'm not arguing we should change that. I'm just saying it's not a python bug.)

.TM.
--
       ____/  ____/   /
      /      /       /            Marco Colombo
     ___/  ___  /   /              Technical Manager
    /          /   /             ESI s.r.l.
  _____/ _____/  _/               Colombo@ESI.it
