Re: libpq: Process buffered SSL read bytes to support records >8kB on async API - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: libpq: Process buffered SSL read bytes to support records >8kB on async API
Date
Msg-id CAHgHdKseseJG0j_96xrOuP3LgubvGScvXsQyPCYWANnBxTL4tA@mail.gmail.com
Whole thread
In response to Re: libpq: Process buffered SSL read bytes to support records >8kB on async API  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: libpq: Process buffered SSL read bytes to support records >8kB on async API
List pgsql-hackers


On Fri, Dec 5, 2025 at 5:32 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 01/08/2025 00:48, Jacob Champion wrote:
> On Fri, Jul 18, 2025 at 11:11 AM Jacob Champion
> <jacob.champion@enterprisedb.com> wrote:
>> The attached still needs some documentation work
>
> v2 does a bunch of commit message work, but I imagine it needs a good
> bit of copy-editing for clarity.
>
> I'm not in any hurry to smash this in. I think we still need
> - independent verification of the architectural issue, to make sure
> it's not any deeper or shallower than pqReadData()
> - independent verification that this fixes the bugs that have been described
> - measurement of the performance characteristics of the new code
> - verification of the maximum amount of additional buffer memory that
> can be consumed during the drain
> - consensus that we want to maintain this new behavior
> - discussion of what we want this code to look like going forward

I agree that making pqReadData() drain the transport buffer is the right
approach. I'm not sure if this can ever work with the OpenSSL readahead
that was discussed, but we don't need to solve that now.

Once we make pqReadData() to always drain the buffer, does
pqSocketCheck() still need to check for pending bytes? It seems harmless
to do it in any case, though, so probably best to keep it.

gss_read() calls pqReadReady() which now calls pqsecure_bytes_pending(),
which now also checks for bytes pending in the GSS buffer. Is that a
good thing or a bad thing or does it not matter?

In pqReadData we have this:

>       /*
>        * A return value of 0 could mean just that no data is now available, or
>        * it could mean EOF --- that is, the server has closed the connection.
>        * Since we have the socket in nonblock mode, the only way to tell the
>        * difference is to see if select() is saying that the file is ready.
>        * Grumble.  Fortunately, we don't expect this path to be taken much,
>        * since in normal practice we should not be trying to read data unless
>        * the file selected for reading already.
>        *
>        * In SSL mode it's even worse: SSL_read() could say WANT_READ and then
>        * data could arrive before we make the pqReadReady() test, but the second
>        * SSL_read() could still say WANT_READ because the data received was not
>        * a complete SSL record.  So we must play dumb and assume there is more
>        * data, relying on the SSL layer to detect true EOF.
>        */
>
> #ifdef USE_SSL
>       if (conn->ssl_in_use)
>               return 0;
> #endif

Should we do the same for GSS as we do for SSL here?


Apart from the above-mentioned things, the patch looks bug-free to me.
However, it feels like a layering violation. The *_drain_pending()
functions should not write directly to conn->inBuffer, or expand the buffer.

To avoid that, I propose the attached to move the buffer-expansin logic
to the caller. The pqsecure API now has this:

/*
  *  Return the number of bytes available in the transport buffer.
  *
  * If pqsecure_read() is called for this number of bytes, it's
guaranteed to
  * return successfully without reading from the underlying socket.  See
  * pqDrainPending() for a more complete discussion of the concepts
involved.
  */
ssize_t pqsecure_bytes_pending(PGconn *conn);

pqDrainPending() reads pending bytes into conn->inBuffer + conn->inEnd but never advances conn->inEnd.  Compare with pqReadData_internal, which always does conn->inEnd += nread after a successful read. Is pgDrainPending()'s behavior correct?  Sorry if I am misunderstanding.  Without that update, could the drained bytes occupy buffer memory that libpq doesn't know about?   Would the next call to pqReadData_internal overwrite those positions, silently losing the drained data?

In practice, I was unable to trigger this code path.  When the protocol message parser in fe-protocol3.c (around line 122) sees a large DataRow header, it pre-expands the input buffer via pqCheckInBufferSpace() to hold the entire message.  This ensures that the buffer always has enough free space for the next TLS record (max 16384 bytes plaintext), so SSL_read never returns a partial record and SSL_pending() is always 0 at the point pqDrainPending is called.  Perhaps you thought of that, and that's why you didn't bother to worry about this?

I suspect this is an unreachable bug, but perhaps it's just in need of some code comments.  What are your thoughts?
 
pqReadData(), or its pqDrainPending() subroutine to be precise, uses
that and pqsecure_read() to drain the buffer. This is less code because
it reuses pqsecure_read(), and doesn't require the TLS/GSS
implementation to reach out into connection->inBuffer.

This does add that assumption to pqsecure_read() that's mentioned in the
comment above though. If we want to avoid that assumption, we could add
another "pqsecure_read_pending()" function that's just like
pqsecure_read(), except that it would only read pending bytes and never
from the socket.

This is completely untested, I just wanted to show the internal design
for now.

- Heikki


--

Mark Dilger

pgsql-hackers by date:

Previous
From: Hannu Krosing
Date:
Subject: Re: SQL Property Graph Queries (SQL/PGQ)
Next
From: Nisha Moond
Date:
Subject: Re: Use SIGTERM instead of SIGUSR1 for slotsync worker to exit during promotion?