Home > mailing lists

Re: Support for NSS as a libpq TLS backend - Mailing list pgsql-hackers

From	Jacob Champion
Subject	Re: Support for NSS as a libpq TLS backend
Date	November 12, 2020 22:12:42
Msg-id	FBA0C365-3B19-4E40-98F4-8B0064DB7127@vmware.com Whole thread Raw
In response to	Re: Support for NSS as a libpq TLS backend (Jacob Champion <pchampion@vmware.com>)
Responses	Re: Support for NSS as a libpq TLS backend
List	pgsql-hackers

Tree view

On Nov 11, 2020, at 10:57 AM, Jacob Champion <pchampion@vmware.com> wrote:
>
> False alarm -- the stderr debugging I'd added in to track down the
> assertion tripped up the "no stderr" tests. Zero failing tests now.

I took a look at the OpenSSL interop problems you mentioned upthread. I
don't see a hang like you did, but I do see a PR_IO_TIMEOUT_ERROR during
connection.

I think pgtls_read() needs to treat PR_IO_TIMEOUT_ERROR as if no bytes
were read, in order to satisfy its API. There was some discussion on
this upthread:

On Oct 27, 2020, at 1:07 PM, Daniel Gustafsson <daniel@yesql.se> wrote:
>
> On 20 Oct 2020, at 21:15, Andres Freund <andres@anarazel.de> wrote:
>>
>>> +            case PR_IO_TIMEOUT_ERROR:
>>> +                break;
>>
>> What does this mean? We'll return with a 0 errno here, right? When is
>> this case reachable?
>
> It should, AFAICT, only be reachable when PR_Recv is used with a timeout which
> we don't do.  It mentioned somewhere that it had happened in no-wait calls due
> to a bug, but I fail to find that reference now.  Either way, I've removed it
> to fall into the default error handling which now sets errno correctly as that
> was a paddle short here.

PR_IO_TIMEOUT_ERROR is definitely returned in no-wait calls on my
machine. It doesn't look like the PR_Recv() API has a choice -- if
there's no data, it can't return a positive integer, and returning zero
means that the socket has been disconnected. So -1 with a timeout error
is the only option.

I'm not completely sure why this is exposed so easily with an OpenSSL
server -- I'm guessing the implementation slices up its packets
differently on the wire, causing a read event before NSS is able to
decrypt a full record -- but it's worth noting that this case also shows
up during NSS-to-NSS psql connections, when handling notifications at
the end of every query. PQconsumeInput() reports a hard failure with the
current implementation, but its return value is ignored by
PrintNotifications(). Otherwise this probably would have showed up
earlier.

(What's the best way to test this case? Are there lower-level tests for
the protocol/network layer somewhere that I'm missing?)

While patching this case, I also noticed that pgtls_read() doesn't call
SOCK_ERRNO_SET() for the disconnection case. That is also in the
attached patch.

--Jacob

Attachment

nss-handle-timeouts-and-disconnections-in-pgtls_read.patch

pgsql-hackers by date:

From: Peter Geoghegan
Date: 12 November 2020, 22:00:07
Subject: Re: Deleting older versions in unique indexes to avoid page splits

From: Tom Lane
Date: 12 November 2020, 22:16:33
Subject: Re: Bogus documentation for bogus geometric operators

Re: Support for NSS as a libpq TLS backend - Mailing list pgsql-hackers

Attachment

Previous

Next