Re: Re: COPY BINARY file format proposal - Mailing list pgsql-hackers

From Philip Warner
Subject Re: Re: COPY BINARY file format proposal
Msg-id 3.0.5.32.20001207133403.02d6acc0@mail.rhyme.com.au
In response to Re: Re: COPY BINARY file format proposal  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Re: COPY BINARY file format proposal  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
At 21:12 6/12/00 -0500, Tom Lane wrote:
>Philip Warner <pjw@rhyme.com.au> writes:
>>> OK, we can do it that way.  I'm still going to pick a magic number that
>>> looks different depending on endianness, however ;-).
>
>> What does the smiley mean in this context?
>
>Just thinking that the only way an endianness flag inside the header
>would be useful is if we pick a magic number that's a bytewise
>palindrome.

You could just read the 1st, 2nd, 3rd, etc. bytes and require that they be
'P', 'G', 'C', 'P', 'Y' or some such. I *think* reading five bytes and
doing a strcmp works... i.e. don't rely on the integer value, use a string.
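
A minimal sketch of that check, for illustration only. The signature bytes
are just the 'P', 'G', 'C', 'P', 'Y' example from above, and the names
COPY_SIGNATURE and has_copy_signature are hypothetical, not anything in the
backend. It uses memcmp rather than strcmp because five bytes read from a
file carry no terminating NUL. Since the comparison is byte by byte, the
result does not depend on the byte order of the machine doing the reading.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature: the example bytes suggested in this message. */
static const char COPY_SIGNATURE[] = {'P', 'G', 'C', 'P', 'Y'};
#define COPY_SIGNATURE_LEN (sizeof COPY_SIGNATURE)

/*
 * Return 1 if the first bytes of buf match the signature, 0 otherwise.
 * buf holds raw bytes read from the file; len is how many were read.
 * No integer is ever assembled from the bytes, so endianness is irrelevant.
 */
int
has_copy_signature(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    if (len < COPY_SIGNATURE_LEN)
        return 0;               /* file too short to hold the signature */

    return memcmp(buf, COPY_SIGNATURE, COPY_SIGNATURE_LEN) == 0;
}
```

A reader would call this on the first bytes of the file before interpreting
anything else, and reject the file if it returns 0.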


>> - floating point representation (for portability)
>
>Specified how?  (For that matter, determined how?)

I'd recommend a crystal ball. You did ask a question about the future ;-}.


>> - flag for compressed or uncompressed toast fields (I assume you dump them
>> uncompressed?)
>
>Yes, I want COPY to force 'em to uncompressed so as to avoid problems
>with cross-version changes of compression algorithm.  (Right at the
>moment it gets that wrong.)

Sounds reasonable, but there could be an advantage in allowing a binary
compressed dump for short-term work.


>> - version number may be important if we dump a subset of fields (ie. we'll
>> need to store the field names somewhere).
>
>No we don't.  ASCII COPY format doesn't store field names either ... at
>least not as part of the data stream ... and should not IMHO.  Don't you
>want to be able to reload into a table that you've changed the column
>names of?

This is essential if we ever allow subsets of columns - even if it is only
for displaying information to the user. If I dump 5 out of 7 columns then
rename half of them, I'd say I'm asking for trouble. At least with the
names available, you have a chance of working out what goes where. But
again, without copy-a-subset-of-columns, this also requires a crystal ball.
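
To make the "working out what goes where" point concrete, here is a rough
sketch under the assumption that the file header did carry the dumped column
names. Nothing in the proposal says it does, and map_dumped_columns is a
hypothetical helper, not an existing function. It matches each dumped name
against the target table's current column names and reports how many could
not be placed, which is exactly the case a rename would produce.

```c
#include <stddef.h>
#include <string.h>

/*
 * For each column name recorded in a (hypothetical) file header, find its
 * position in the target table's column list.
 *
 * map[i] receives the table index of dumped[i], or -1 if the table has no
 * column of that name (for example, because it was renamed after the dump).
 * Returns the number of dumped columns that could not be matched.
 */
int
map_dumped_columns(const char *const *dumped, size_t ndumped,
                   const char *const *table, size_t ntable,
                   int *map)
{
    int unmatched = 0;

    for (size_t i = 0; i < ndumped; i++)
    {
        map[i] = -1;
        for (size_t j = 0; j < ntable; j++)
        {
            if (strcmp(dumped[i], table[j]) == 0)
            {
                map[i] = (int) j;
                break;
            }
        }
        if (map[i] < 0)
            unmatched++;
    }
    return unmatched;
}
```

With no names in the file, the loader has only column positions to go on,
and a subset dump followed by renames cannot be diagnosed at all.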


It all gets back to whether it's a good idea to overload a magic number. 



----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

