Re: libpq: Process buffered SSL read bytes to support records >8kB on async API - Mailing list pgsql-hackers
From: Andres Freund
Subject: Re: libpq: Process buffered SSL read bytes to support records >8kB on async API
Msg-id: ghryfzhndzmr5g4fzhwiupvpyyw4mnuq6kiqte6jvmqduippzf@2lnipo7wkort
In response to: Re: libpq: Process buffered SSL read bytes to support records >8kB on async API (Jacob Champion <jacob.champion@enterprisedb.com>)
Responses: Re: libpq: Process buffered SSL read bytes to support records >8kB on async API
List: pgsql-hackers
Hi,

On 2025-07-17 09:48:29 -0700, Jacob Champion wrote:
> On Wed, Jul 16, 2025 at 4:35 PM Andres Freund <andres@anarazel.de> wrote:
> > Why do we care about not hitting the socket? We always operate the socket in
> > non-blocking mode anyway?
>
> IIUC, that would change pqReadData() from a bounded read to an
> unbounded one. Then we have to somehow protect against a server that
> can send data faster than we can process it.

I don't see why it would have to. If a connection has sufficient data in its
buffer on the libpq level, we obviously don't need to read more.

> > FWIW, I have seen too many folks move away from encrypted connections for
> > backups due to openssl being the bottleneck to just accept that we'll never
> > go beyond the worst performance. Any actually decently performing networking
> > will need to divorce when socket IO happens and when openssl sees the data
> > much more heavily.
>
> Can't we make our custom BIO do that?

We can, but that actually just makes this issue *more* acute, not less. We
would obviously have to take the data buffered on the BIO level into account
before signalling to the libpq user that no data is ready.

FWIW, even with BIO-level buffering, it turns out that openssl readahead is
*still* a decent performance boon; going into the BIO twice as many times
isn't free, even if the BIO does buffering.

> > Just relying more and more on extremely tight coupling is a dead end.
>
> Why? This abstraction _must_ leak.

I don't really buy this. We have buffering in libpq, even without TLS. Most of
the issues we need to solve already exist for non-TLS connections (except for
the annoying thing of TLS introducing writes when doing reads and vice versa).
It's just that our implementation is bad.

> TLS does not behave like raw TCP. So if there's no way to get rid of
> the coupling, IMO we might as well use it.

I agree that they're not the same.
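The "data buffered below libpq" problem being discussed can be sketched with a toy Python model (all names here are invented for illustration; this is not libpq or OpenSSL code): a TLS layer decodes a whole 16 kB record on the first read, so after an 8 kB libpq-style read the raw fd looks idle to poll()/select() while half of the plaintext still sits in the TLS layer's buffer.

```python
class BufferedTransport:
    """Toy stand-in for a TLS layer (e.g. OpenSSL) that decodes whole
    records at once: one read from the 'socket' may leave decoded bytes
    buffered inside the layer, invisible to poll()/select() on the fd."""

    RECORD_SIZE = 16 * 1024  # TLS records carry up to 16 kB of plaintext

    def __init__(self, kernel_buffer: bytes):
        self._kernel = kernel_buffer  # pretend kernel socket buffer
        self._decoded = b""           # bytes buffered inside the TLS layer

    def read(self, n: int) -> bytes:
        # Pull a whole record off the "socket" when our buffer is empty.
        if not self._decoded and self._kernel:
            self._decoded = self._kernel[:self.RECORD_SIZE]
            self._kernel = self._kernel[self.RECORD_SIZE:]
        out, self._decoded = self._decoded[:n], self._decoded[n:]
        return out

    def pending(self) -> int:
        # Analogue of SSL_pending(): decoded bytes the fd knows nothing about.
        return len(self._decoded)

    def socket_readable(self) -> bool:
        # Analogue of poll()/select() on the raw fd.
        return bool(self._kernel)


t = BufferedTransport(b"x" * (16 * 1024))  # one full 16 kB record on the wire
chunk = t.read(8192)                        # an 8 kB libpq-style read
# The fd now looks idle, yet 8 kB of plaintext is still buffered:
assert not t.socket_readable()
assert t.pending() == 8192
```

An event loop that only watches the fd would sleep here forever; that is why the buffered bytes have to be taken into account before telling the libpq user that no data is ready.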
I don't see what that has to do with introducing a hard requirement for no
buffering within openssl or layers below (a future BIO that does caching).

> > > IIUC, we can't use SSL_has_pending() either. I need to know how
> > > exactly many bytes to drain out of the userspace buffer without
> > > introducing more bytes into it.
> >
> > Why? The only case where we ought to care about whether pending data exists
> > inside openssl is if our internal buffer is either empty or doesn't contain
> > the entire message we're trying to consume. In either of those two cases we
> > can just consume some data from openssl, without caring how much it
> > precisely is.
>
> I can't tell if you're discussing the fix for this bug or a hypothetical
> future architecture that makes readahead safe.

I don't know what your fix really looks like - afaict you haven't shared it.
So it's hard to comment on it with specifics.

I am not saying that a to-be-backpatched fix needs to make openssl readahead
work, that'd be absurd. But I am concerned with more fundamentally entrenching
the idea that there never is any buffering below libpq's buffering; it'll
cause trouble down the line.

> I don't think PQconnectPoll() behaves the way you're describing; there is no
> retry-read concept.

FWIW, I don't care about what we do during connection establishment. Doing
lots of tiny reads or such will never matter for efficiency, compared to all
the other costs.

To me the relevant cases are uses of libpq on established connections, either
in a blocking way or in a non-blocking way. For blocking use we have that
gross SSL hack in place (i.e. the pgtls_read_pending() check in
pqSocketCheck()), but we obviously *don't* for the non-blocking case - the
problem of this thread.
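The blocking-side workaround referred to above (the pgtls_read_pending() check in pqSocketCheck()) amounts to consulting the TLS layer before sleeping in select()/poll(). A minimal sketch of that shape in Python, with invented names (`tls_pending` stands in for what SSL_pending() reports in OpenSSL terms):

```python
import select
import socket

def wait_for_read(sock, tls_pending) -> bool:
    """Sketch of a pqSocketCheck()-style readiness check: before blocking
    in select(), ask the TLS layer whether it already holds decoded bytes.
    `tls_pending` is a hypothetical callable returning that count."""
    if tls_pending() > 0:
        return True  # decoded data is already available; don't touch the socket
    readable, _, _ = select.select([sock], [], [])
    return bool(readable)

# Demo on a socketpair: buffered TLS bytes short-circuit the select();
# otherwise we wake up on actual socket readability.
a, b = socket.socketpair()
b.sendall(b"hi")
assert wait_for_read(a, lambda: 0)  # wakes via select() on the raw fd
assert wait_for_read(a, lambda: 5)  # short-circuits on buffered TLS bytes
```

The non-blocking API has no equivalent check, which is exactly the gap this thread is about.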
Our APIs around the non-blocking use of libpq are pretty bonkers:

- PQconsumeInput() doesn't tell you whether it actually consumed input
- We kind of document PQconsumeInput() as consuming all the available input,
  but it does no such thing; it just fills the internal buffer, which might be
  big, or might not be, depending on what previously happened on the
  connection.
- PQisBusy() returns true even if there's unconsumed data in the socket

There's really no sane way to write an event loop with this that doesn't do
unnecessary poll()s, because PQconsumeInput() might have consumed data, but
PQisBusy() might *still* return true, because not enough input has been
consumed.

I suspect that the most minimal way to fix this, without breaking the external
API, would be for PQisBusy() to try to read more data. But that's also pretty
gross.

Greetings,

Andres Freund
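The event-loop problem can be made concrete with a toy Python model of the API as the mail describes it (the class, its methods, and the newline-terminated "message" are all invented for illustration; this is not the real libpq): `consume_input()` returns no byte count and `is_busy()` stays true until a full message is buffered, so the caller cannot distinguish a short read from no read and must go around the poll loop again even though the remaining bytes were already readable.

```python
class ToyConn:
    """Toy model of the non-blocking read API described in the mail:
    consume_input() (like PQconsumeInput) reads some bytes and reports only
    success; is_busy() (like PQisBusy) is true until a full newline-terminated
    message is buffered."""

    def __init__(self):
        self._kernel = b""   # pretend kernel socket buffer
        self._buf = b""      # library-internal buffer
        self._readsize = 4   # tiny per-call read to force multiple rounds

    def deliver(self, data: bytes) -> None:
        """The 'server' puts bytes on the wire."""
        self._kernel += data

    def consume_input(self) -> bool:
        # Reads a bounded chunk; gives the caller no byte count back.
        chunk = self._kernel[:self._readsize]
        self._kernel = self._kernel[self._readsize:]
        self._buf += chunk
        return True

    def is_busy(self) -> bool:
        return b"\n" not in self._buf


conn = ToyConn()
conn.deliver(b"hello\n")  # the whole message is already in the kernel buffer
polls = 0
while conn.is_busy():
    polls += 1            # each round stands for a poll() + consume_input()
    conn.consume_input()  # trip; the API never says whether progress was made
# The second round was a wasted poll(): the bytes were readable all along,
# but is_busy() gave no way to know that more consuming (not waiting) was
# what was needed.
assert polls == 2
```

This is the shape of the "unnecessary poll()s" complaint: the loop cannot tell "wait for the socket" apart from "call consume_input() again".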