Re: [PoC] Federated Authn/z with OAUTHBEARER - Mailing list pgsql-hackers

From Jacob Champion
Subject Re: [PoC] Federated Authn/z with OAUTHBEARER
Date
Msg-id CAOYmi+mf+hw7x+E3XkRypEGy=yR3n6zowjjXeXaSK20WxX3UGw@mail.gmail.com
Whole thread Raw
In response to Re: [PoC] Federated Authn/z with OAUTHBEARER  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Fri, Feb 28, 2025 at 5:44 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> If you trigger the new optional NetBSD CI task, the oauthvalidator
> tests implode[1].

Oh, thank you for reporting that. I need to pay more attention to the
BSD CI thread.

> Apparently that OS's kevent() doesn't like zero
> relative timeouts for EVFILT_TIMER[2].  I see that you worked around
> the same problem for Linux timerfd already by rounding 0 up to 1, so
> we could just do the same here, and it passes with the attached.

Just from inspection, that looks good to me. I'll look into running
the new BSD tasks on the other patches I posted above, too.

Should we maybe consider just doing that across the board, and put up
with the inefficiency? Admittedly 1ms is a lot more dead time than
1ns...

> A
> cute alternative, not tested, might be to put NOTE_ABSTIME into fflag
> if timeout == 0 (then it's an absolute time in the past, which should
> fire immediately).

That could work. I think I need to stare at the man pages more if we
go that direction, since (IIUC) NOTE_ABSTIME changes up some other
default behavior for the timer.

> But I'm curious, how hard would it be to do this ↓ instead and not
> have that problem on any OS?
>
>      * There might be an optimization opportunity here: if timeout == 0, we
>      * could signal drive_request to immediately call
>      * curl_multi_socket_action, rather than returning all the way up the
>      * stack only to come right back. But it's not clear that the additional
>      * code complexity is worth it.

I'm not sure if it's hard, so much as it is confusing. My first
attempt at it a while back gave me the feeling that I wouldn't
remember how it worked in a few months.

Here are the things that I think we would have to consider, at minimum:
1. Every call to curl_multi_socket_action/all now has to look for the
new "time out immediately" flag after returning.
2. If it is set, and if actx->running has been set to zero, we have to
decide what that means conceptually. (Is that an impossible case? Or
do we ignore the timeout and assume the request is done/failed?)
3. Otherwise, we need to clear the flag and immediately call
curl_multi_socket_action(CURL_SOCKET_TIMEOUT), and repeat until that
flag is no longer set. That feels brittle to me, because if there's
some misunderstanding in our code or some strange corner case in an
old version of Curl on some platform, and we keep getting a timeout of
zero, we'll hit an infinite loop. (The current behavior instead
returns control to the top level every time, and gives
curl_multi_socket_all() a chance to right the ship by checking the
status of all the outstanding sockets.)
4. Even if 2 and 3 are just FUD, there's another potential call to
set_timer(0) in OAUTH_STEP_TOKEN_REQUEST. The interval can only be
zero in debug mode, so if we add new code for that case alone, the
tests will be mostly exercising a non-production code path.

I prefer your patch, personally.

Thanks!
--Jacob



pgsql-hackers by date:

Previous
From: Nazir Bilal Yavuz
Date:
Subject: Re: Adding NetBSD and OpenBSD to Postgres CI
Next
From: Matthias van de Meent
Date:
Subject: Re: Incorrect result of bitmap heap scan.