Re: Re: COPY BINARY file format proposal - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Re: COPY BINARY file format proposal |
Date | |
Msg-id | 9435.976323300@sss.pgh.pa.us |
In response to | Re: Re: COPY BINARY file format proposal (Philip Warner <pjw@rhyme.com.au>) |
List | pgsql-hackers |
Philip Warner <pjw@rhyme.com.au> writes:
> How about a CRC? ;-P

I take it from the smiley that you're not serious, but actually it seems
like it might not be a bad idea.  I could see appending a CRC to each
tuple record.  Comments anyone?

You seemed to like the PNG philosophy of using feature flags rather than
a version number.  Accordingly, I propose dropping the version number
field in favor of a flags word.  (Which was needed anyway, because I had
*again* forgotten about COPY WITH OIDS :-(.)

Attached is the current state of the proposal.  I haven't added a CRC
field but am willing to do so if that's the consensus.

                        regards, tom lane


COPY BINARY file format proposal

The objectives of this change are:

1. Get rid of the tuple count at the front of the file.  This requires
an extra pass over the relation, which is a lot more trouble than the
count is worth.  Use an explicit EOF marker instead.

2. Send fields of a tuple individually, instead of dumping out raw tuples
(complete with alignment padding and so forth) as is currently done.
This is mainly to simplify TOAST-related processing.

3. Make the format somewhat self-identifying, so that the reader has at
least some chance of detecting it when the data doesn't match the table
it's supposed to be loaded into.

The proposed format consists of a file header, zero or more tuples, and
a file trailer.


File Header
-----------

The proposed file header consists of 24 bytes of fixed fields, followed
by a variable-length header extension area.

Signature: 12-byte sequence "PGBCOPY\n\377\r\n\0" --- note that the null
is a required part of the signature.  (The signature is designed to allow
easy identification of files that have been munged by a non-8-bit-clean
transfer.  The proposed signature will be changed by newline-translation
filters, dropped nulls, dropped high bits, or parity changes.)

Integer layout field: int32 constant 0x01020304 in source's byte order.
Potentially, a reader could engage in byte-flipping of subsequent fields
if the wrong byte order is detected here.

Flags field: a 4-byte bit mask to denote important aspects of the file
format.  Bits are numbered from 0 (LSB) to 31 (MSB) --- note that this
field is stored with source's endianness, as are all subsequent integer
fields.  Bits 16-31 are reserved to denote critical file format issues;
a reader should abort if it finds an unexpected bit set in this range.
Bits 0-15 are reserved to signal backwards-compatible format issues;
a reader should simply ignore any unexpected bits set in this range.
Currently only one flag bit is defined, and the rest must be zero:

        Bit 16: if 1, OIDs are included in the dump; if 0, not

Next 4 bytes: length of remainder of header, not including self.  In
the initial version this will be zero, and the first tuple follows
immediately.  Future changes to the format might allow additional data
to be present in the header.  A reader should silently ignore any header
extension data it does not know what to do with.

Note that I envision the content of the header extension area as being
a sequence of self-identifying chunks (but the specific design of same
is postponed until we need 'em).  The flags field is not intended to tell
readers what is in the extension area.

This design allows for both backwards-compatible header additions (add
header extension chunks, or set low-order flag bits) and
non-backwards-compatible changes (set high-order flag bits to signal such
changes, and add supporting data to the extension area if needed).


Tuples
------

Each tuple begins with an int16 count of the number of fields in the
tuple.  (Presently, all tuples in a table will have the same count, but
that might not always be true.)  Then, repeated for each field in the
tuple, there is an int16 typlen word possibly followed by field data.
The typlen field is interpreted thus:

        Zero    Field is NULL.  No data follows.

        > 0     Field is a fixed-length datatype.  Exactly N bytes of
                data follow the typlen word.

        -1      Field is a varlena datatype.  The next four bytes are
                the varlena header, which contains the total value
                length including itself.

        < -1    Reserved for future use.

For non-NULL fields, the reader can check that the typlen matches the
expected typlen for the destination column.  This provides a simple
but very useful check that the data is as expected.

There is no alignment padding or any other extra data between fields.
Note also that the format does not distinguish whether a datatype is
pass-by-reference or pass-by-value.  Both of these provisions are
deliberate: they might help improve portability of the files (although
of course endianness and floating-point-format issues can still keep
you from moving a binary file across machines).

If OIDs are included in the dump, the OID field immediately follows the
field-count word.  It is a normal field except that it's not included
in the field-count.  In particular it has a typlen --- this will allow
handling of 4-byte vs 8-byte OIDs without too much pain, and will allow
OIDs to be shown as NULL if we someday allow OIDs to be optional.


File Trailer
------------

The file trailer consists of an int16 word containing -1.  This is
easily distinguished from a tuple's field-count word.

A reader should report an error if a field-count word is neither -1
nor the expected number of columns.  This provides a pretty strong
check against somehow getting out of sync with the data.
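

Illustrative reader sketch
--------------------------

The following is only a sketch of how a reader for the layout described
above might look, written in Python for brevity; it is not part of the
proposal and not PostgreSQL code.  All names (read_header, read_field,
read_tuples, build_sample) are hypothetical.  It assumes the varlena
header is a plain int32 total length in the file's byte order, returns
field data as raw bytes, and returns varlena payloads without their
4-byte header.

```python
import io
import struct

SIGNATURE = b"PGBCOPY\n\xff\r\n\x00"  # 12 bytes, trailing NUL included
LAYOUT = 0x01020304                   # integer layout constant
FLAG_OIDS = 1 << 16                   # bit 16: OIDs are included
CRITICAL_MASK = 0xFFFF0000            # bits 16-31: abort if unknown

def read_exact(f, n):
    data = f.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of file")
    return data

def read_header(f):
    """Return (byte-order prefix for struct, has_oids)."""
    if read_exact(f, 12) != SIGNATURE:
        raise ValueError("bad signature (file damaged in transfer?)")
    raw = read_exact(f, 4)
    if struct.unpack("<i", raw)[0] == LAYOUT:
        order = "<"
    elif struct.unpack(">i", raw)[0] == LAYOUT:
        order = ">"
    else:
        raise ValueError("unrecognized integer layout")
    flags, ext_len = struct.unpack(order + "II", read_exact(f, 8))
    if flags & CRITICAL_MASK & ~FLAG_OIDS:
        raise ValueError("unknown critical flag bit set")
    read_exact(f, ext_len)  # silently skip unknown header extension data
    return order, bool(flags & FLAG_OIDS)

def read_field(f, order):
    (typlen,) = struct.unpack(order + "h", read_exact(f, 2))
    if typlen == 0:
        return None                       # NULL, no data follows
    if typlen > 0:
        return read_exact(f, typlen)      # fixed-length datatype
    if typlen == -1:                      # varlena: length includes header
        (total,) = struct.unpack(order + "i", read_exact(f, 4))
        if total < 4:
            raise ValueError("bad varlena length")
        return read_exact(f, total - 4)
    raise ValueError("reserved typlen %d" % typlen)

def read_tuples(f, expected_fields):
    """Read the whole file; return a list of (oid, fields) pairs."""
    order, has_oids = read_header(f)
    rows = []
    while True:
        (count,) = struct.unpack(order + "h", read_exact(f, 2))
        if count == -1:                   # file trailer
            return rows
        if count != expected_fields:
            raise ValueError("field count mismatch, out of sync?")
        oid = read_field(f, order) if has_oids else None
        rows.append((oid, [read_field(f, order) for _ in range(count)]))

def build_sample(order="<"):
    """Build a tiny two-tuple file in memory, without OIDs."""
    out = SIGNATURE + struct.pack(order + "iII", LAYOUT, 0, 0)
    out += struct.pack(order + "h", 2)                      # field count
    out += struct.pack(order + "hi", 4, 42)                 # 4-byte fixed
    out += struct.pack(order + "hi", -1, 4 + 5) + b"hello"  # varlena
    out += struct.pack(order + "h", 2)
    out += struct.pack(order + "h", 0)                      # NULL
    out += struct.pack(order + "hi", -1, 4)                 # empty varlena
    out += struct.pack(order + "h", -1)                     # trailer
    return out

rows = read_tuples(io.BytesIO(build_sample()), expected_fields=2)
```

Calling build_sample(">") produces the same file in the opposite byte
order, and read_header picks that up from the layout field.  A file
whose "\r\n" has been collapsed to "\n" by a text-mode transfer fails
the signature comparison, which is the point of the signature design.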