First draft of new FE/BE protocol spec posted for comments - Mailing list pgsql-interfaces

From Tom Lane
Subject First draft of new FE/BE protocol spec posted for comments
Msg-id 10770.1050448110@sss.pgh.pa.us
I have committed a first-draft revision of the FE/BE protocol document;
you can read it at
http://candle.pha.pa.us/main/writings/pgsql/sgml/protocol.html
or in a few hours at
http://developer.postgresql.org/docs/postgres/protocol.html
I'd appreciate it if people would look it over for both presentation
and content.

There are a couple of loose ends that are still bothering me --- please
comment:

The new Execute command (part of the extended query protocol) has a
field saying whether to return data in text or binary format.  When
retrieving from a cursor, it is not clear whether this should override
the declaration of the cursor (BINARY or not).  I am inclined to think
that it should, but a possible compromise is to add a third value of the
field meaning "don't care", in which case you'd get back text in all
cases except when reading a cursor declared BINARY.  This would be
strictly for backwards compatibility though, and so maybe it doesn't
matter.  Old apps will probably be going through the simple-Query
interface, which will give them the old behavior.
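
To make the proposed compromise concrete, here is a minimal sketch of how the format for an Execute result might be resolved if a third "don't care" value were added. The names Format, DONT_CARE, and result_format are hypothetical and the numeric codes are arbitrary; none of them come from the draft spec.

```python
from enum import Enum


class Format(Enum):
    """Hypothetical values of the Execute format field."""
    TEXT = 0
    BINARY = 1
    DONT_CARE = 2  # the proposed third value; not part of the draft


def result_format(requested, cursor_declared_binary):
    """Pick the format in which rows would be returned.

    An explicit TEXT or BINARY request overrides the cursor declaration.
    DONT_CARE falls back to the old behavior: text in all cases except
    when reading a cursor that was declared BINARY.
    """
    if requested is Format.DONT_CARE:
        return Format.BINARY if cursor_declared_binary else Format.TEXT
    return requested
```

Under this sketch, only the DONT_CARE case ever consults the cursor declaration, which is why it matters only for backwards compatibility.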

I have dropped the CursorResponse message from the protocol, as it
didn't seem to be doing anything useful; does anyone care about it?

The document as it stands is somewhat inconsistent about binary data
formats.  The new message types I've added are currently specified to
use a representation that matches the COPY BINARY file format: an int16
typlen (replaced with 0 if NULL), followed by a field value, where, if
typlen = -1, the first four bytes of the field value are a self-inclusive
length word.  The existing message types that handle binary data (BinaryRow,
FunctionCall, FunctionResult) do it differently: a physical length (not
counting self) followed by data.  This is a bit of a mess, and I think
it would make sense to standardize the representation one way or the
other.  The reason I'm inclined to move away from the old representation
is that it's effectively broken on machines where MAXALIGN is greater
than four: because it strips the length word out of the "contents" of
varlena datatypes, the remainder of the varlena is not correctly aligned
when stored in libpq memory.  (The mailing list archives seem to be down at
the moment, so I can't give a URL, but there was a discussion of this
point in pgsql-hackers on 2-Aug-99.)  This will clearly be a
user-visible change for people using binary cursors, but I think we
*must* change it now or be stuck with the old mistake forever.
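
The two layouts described above can be compared side by side with a rough sketch. It assumes network byte order and a four-byte length word for the old layout; neither detail is stated in this message, and the function names are hypothetical.

```python
import struct


def encode_copy_style(typlen, value):
    """COPY BINARY style: int16 typlen, then the field value.

    typlen is replaced with 0 for NULL (value is None). For typlen == -1
    the field value starts with a four-byte length that counts itself.
    Byte order is an assumption (network order).
    """
    if value is None:
        return struct.pack("!h", 0)
    if typlen == -1:
        return struct.pack("!hi", -1, len(value) + 4) + value
    if len(value) != typlen:
        raise ValueError("fixed-length value does not match typlen")
    return struct.pack("!h", typlen) + value


def encode_old_style(value):
    """Old style: a physical length (not counting itself), then the data.

    The four-byte width of the length word is an assumption.
    """
    return struct.pack("!i", len(value)) + value


# The same three-byte varlena payload in both layouts.
copy_style = encode_copy_style(-1, b"abc")
old_style = encode_old_style(b"abc")
```

In the first layout the length word travels as part of the field value, so a receiver can keep the varlena intact; in the second it is separated from the data, which is the stripping that causes the alignment problem described above.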

An alternative approach, assuming we get as far as implementing
architecture-independent binary representations, is to change the COPY
BINARY file format to use them, and then the issue largely goes away ---
we can stick with the existing layout for BinaryRow and make the other
FE/BE messages use a similar format.  But in either case, something
breaks --- either binary cursors or COPY BINARY files.  Does anyone have
a preference as to which to break?
        regards, tom lane


