Re: libpq: Process buffered SSL read bytes to support records >8kB on async API - Mailing list pgsql-hackers
From: Andres Freund
Subject: Re: libpq: Process buffered SSL read bytes to support records >8kB on async API
Msg-id: ghryfzhndzmr5g4fzhwiupvpyyw4mnuq6kiqte6jvmqduippzf@2lnipo7wkort
In response to: Re: libpq: Process buffered SSL read bytes to support records >8kB on async API (Jacob Champion <jacob.champion@enterprisedb.com>)
Responses: Re: libpq: Process buffered SSL read bytes to support records >8kB on async API
List: pgsql-hackers
Hi,

On 2025-07-17 09:48:29 -0700, Jacob Champion wrote:
> On Wed, Jul 16, 2025 at 4:35 PM Andres Freund <andres@anarazel.de> wrote:
> > Why do we care about not hitting the socket? We always operate the socket in
> > non-blocking mode anyway?
>
> IIUC, that would change pqReadData() from a bounded read to an
> unbounded one. Then we have to somehow protect against a server that
> can send data faster than we can process it.

I don't see why it would have to. If a connection has sufficient data in its
buffer on the libpq level, we obviously don't need to read more.

> > FWIW, I have seen too many folks move away from encrypted connections for
> > backups due to openssl being the bottleneck to just accept that we'll never
> > go beyond the worst performance. Any actually decently performing networking
> > will need to divorce when socket IO happens and when openssl sees the data
> > much more heavily.
>
> Can't we make our custom BIO do that?

We can, but that actually just makes this issue *more* acute, not less. We
would obviously have to take the data buffered on the BIO level into account
before signalling to the libpq user that no data is ready.

FWIW, even with BIO-level buffering, it turns out that openssl readahead is
*still* a decent performance boon; going into the BIO twice as many times
isn't free, even if the BIO does buffering.

> > Just relying more and more on extremely tight coupling is a dead end.
>
> Why? This abstraction _must_ leak.

I don't really buy this. We have buffering in libpq, even without TLS. Most of
the issues we need to solve already exist for non-TLS connections (except for
the annoying thing of TLS introducing writes when doing reads and vice versa).
It's just that our implementation is bad.

> TLS does not behave like raw TCP. So if there's no way to get rid of
> the coupling, IMO we might as well use it.

I agree that they're not the same.
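The "data buffered below libpq" problem being discussed can be sketched with a toy Python model (all names here are invented for illustration; this is not libpq or OpenSSL code): a TLS layer decodes a whole 16 kB record on the first read, so after an 8 kB libpq-style read the raw fd looks idle to poll()/select() while half of the plaintext still sits in the TLS layer's buffer.

```python
class BufferedTransport:
    """Toy stand-in for a TLS layer (e.g. OpenSSL) that decodes whole
    records at once: one read from the 'socket' may leave decoded bytes
    buffered inside the layer, invisible to poll()/select() on the fd."""

    RECORD_SIZE = 16 * 1024  # TLS records carry up to 16 kB of plaintext

    def __init__(self, kernel_buffer: bytes):
        self._kernel = kernel_buffer  # pretend kernel socket buffer
        self._decoded = b""           # bytes buffered inside the TLS layer

    def read(self, n: int) -> bytes:
        # Pull a whole record off the "socket" when our buffer is empty.
        if not self._decoded and self._kernel:
            self._decoded = self._kernel[:self.RECORD_SIZE]
            self._kernel = self._kernel[self.RECORD_SIZE:]
        out, self._decoded = self._decoded[:n], self._decoded[n:]
        return out

    def pending(self) -> int:
        # Analogue of SSL_pending(): decoded bytes the fd knows nothing about.
        return len(self._decoded)

    def socket_readable(self) -> bool:
        # Analogue of poll()/select() on the raw fd.
        return bool(self._kernel)


t = BufferedTransport(b"x" * (16 * 1024))  # one full 16 kB record on the wire
chunk = t.read(8192)                        # an 8 kB libpq-style read
# The fd now looks idle, yet 8 kB of plaintext is still buffered:
assert not t.socket_readable()
assert t.pending() == 8192
```

An event loop that only watches the fd would sleep here forever; that is why the buffered bytes have to be taken into account before telling the libpq user that no data is ready.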
I don't see what that has to do with introducing a hard requirement for no
buffering within openssl or layers below (a future BIO that does caching).

> > > IIUC, we can't use SSL_has_pending() either. I need to know how
> > > exactly many bytes to drain out of the userspace buffer without
> > > introducing more bytes into it.
> >
> > Why? The only case where we ought to care about whether pending data exists
> > inside openssl is if our internal buffer is either empty or doesn't contain
> > the entire message we're trying to consume. In either of those two cases we
> > can just consume some data from openssl, without caring how much it
> > precisely is.
>
> I can't tell if you're discussing the fix for this bug or a hypothetical
> future architecture that makes readahead safe.

I don't know what your fix really looks like - afaict you haven't shared it.
So it's hard to comment on it with specifics.

I am not saying that a to-be-backpatched fix needs to make openssl readahead
work, that'd be absurd. But I am concerned with more fundamentally entrenching
the idea that there never is any buffering below libpq's buffering; it'll
cause trouble down the line.

> I don't think PQconnectPoll() behaves the way you're describing; there is no
> retry-read concept.

FWIW, I don't care about what we do during connection establishment. Doing
lots of tiny reads or such will never matter for efficiency, compared to all
the other costs.

To me the relevant cases are uses of libpq on established connections, either
in a blocking way or in a non-blocking way. For blocking use we have that
gross SSL hack in place (i.e. the pgtls_read_pending() check in
pqSocketCheck()), but we obviously *don't* for the non-blocking case - the
problem of this thread.
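The blocking-side workaround referred to above (the pgtls_read_pending() check in pqSocketCheck()) amounts to consulting the TLS layer before sleeping in select()/poll(). A minimal sketch of that shape in Python, with invented names (`tls_pending` stands in for what SSL_pending() reports in OpenSSL terms):

```python
import select
import socket

def wait_for_read(sock, tls_pending) -> bool:
    """Sketch of a pqSocketCheck()-style readiness check: before blocking
    in select(), ask the TLS layer whether it already holds decoded bytes.
    `tls_pending` is a hypothetical callable returning that count."""
    if tls_pending() > 0:
        return True  # decoded data is already available; don't touch the socket
    readable, _, _ = select.select([sock], [], [])
    return bool(readable)

# Demo on a socketpair: buffered TLS bytes short-circuit the select();
# otherwise we wake up on actual socket readability.
a, b = socket.socketpair()
b.sendall(b"hi")
assert wait_for_read(a, lambda: 0)  # wakes via select() on the raw fd
assert wait_for_read(a, lambda: 5)  # short-circuits on buffered TLS bytes
```

The non-blocking API has no equivalent check, which is exactly the gap this thread is about.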
Our APIs around the non-blocking use of libpq are pretty bonkers:

- PQconsumeInput() doesn't tell you whether it actually consumed input
- We kind of document PQconsumeInput() as consuming all the available input,
  but it does no such thing; it just fills the internal buffer, which might be
  big, or might not be, depending on what previously happened on the
  connection.
- PQisBusy() returns true even if there's unconsumed data in the socket

There's really no sane way to write an event loop with this that doesn't do
unnecessary poll()s, because PQconsumeInput() might have consumed data, but
PQisBusy() might *still* return true, because not enough input has been
consumed.

I suspect that the most minimal way to fix this, without breaking the external
API, would be for PQisBusy() to try to read more data. But that's also pretty
gross.

Greetings,

Andres Freund
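The event-loop problem can be made concrete with a toy Python model of the API as the mail describes it (the class, its methods, and the newline-terminated "message" are all invented for illustration; this is not the real libpq): `consume_input()` returns no byte count and `is_busy()` stays true until a full message is buffered, so the caller cannot distinguish a short read from no read and must go around the poll loop again even though the remaining bytes were already readable.

```python
class ToyConn:
    """Toy model of the non-blocking read API described in the mail:
    consume_input() (like PQconsumeInput) reads some bytes and reports only
    success; is_busy() (like PQisBusy) is true until a full newline-terminated
    message is buffered."""

    def __init__(self):
        self._kernel = b""   # pretend kernel socket buffer
        self._buf = b""      # library-internal buffer
        self._readsize = 4   # tiny per-call read to force multiple rounds

    def deliver(self, data: bytes) -> None:
        """The 'server' puts bytes on the wire."""
        self._kernel += data

    def consume_input(self) -> bool:
        # Reads a bounded chunk; gives the caller no byte count back.
        chunk = self._kernel[:self._readsize]
        self._kernel = self._kernel[self._readsize:]
        self._buf += chunk
        return True

    def is_busy(self) -> bool:
        return b"\n" not in self._buf


conn = ToyConn()
conn.deliver(b"hello\n")  # the whole message is already in the kernel buffer
polls = 0
while conn.is_busy():
    polls += 1            # each round stands for a poll() + consume_input()
    conn.consume_input()  # trip; the API never says whether progress was made
# The second round was a wasted poll(): the bytes were readable all along,
# but is_busy() gave no way to know that more consuming (not waiting) was
# what was needed.
assert polls == 2
```

This is the shape of the "unnecessary poll()s" complaint: the loop cannot tell "wait for the socket" apart from "call consume_input() again".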