Re: More efficient build farm animal wakeup? - Mailing list pgsql-hackers
From | Magnus Hagander |
---|---|
Subject | Re: More efficient build farm animal wakeup? |
Date | |
Msg-id | CABUevEysQnc4UqPa--jOrUzD9YUabqvSdPHL371EBFmymqz_dw@mail.gmail.com |
In response to | Re: More efficient build farm animal wakeup? (Thomas Munro <thomas.munro@gmail.com>) |
Responses |
Re: More efficient build farm animal wakeup?
|
List | pgsql-hackers |
On Sun, Nov 20, 2022 at 4:56 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Sun, Nov 20, 2022 at 1:35 AM Magnus Hagander <magnus@hagander.net> wrote:
> tl;dr: it's not there now, but yes, if we can find a smart way for the bf clients to consume it, it is something we could build and deploy fairly easily.
Cool -- it sounds a lot like you've thought about this already :-)
About the client: currently run_branches.pl makes an HTTP request for
the "branches of interest" list. Seems like a candidate point for a
long poll? I don't think it'd have to be much smarter than it is
today, it'd just have to POST the commits it already has, I think.
Um, the branches of interest list only changes when there is a new *branch*, not a new *commit*, so I think that would be a very different problem to solve. And I don't think we get new branches *that* often...
Perhaps as a first step, the server could immediately report which
branches to bother fetching, considering the client's existing
commits. That'd almost always be none, but ~11.7 times per day a new
commit shows up, and once a year there's a new interesting branch.
That would avoid the need for the 6 git fetches that usually follow in
the common case, which admittedly might not be a change worth making
on its own. After all, the git fetches are probably quite similar
HTTP requests themselves, except that there are 6 of them, one per branch,
and they hit the public git server instead of some hypothetical
buildfarm endpoint.
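
A minimal sketch of that "which branches to bother fetching" check, assuming both sides represent their state as a mapping from branch name to commit hash. The function name and the sample client state are hypothetical; the two hashes are the ones used as examples further down in this mail.

```python
def branches_to_fetch(client_tips, server_tips):
    """Return the branches whose server tip differs from the client's copy.

    Both arguments map branch name -> commit hash. A branch the client has
    never seen (a new "branch of interest") is reported as well.
    """
    return sorted(
        branch
        for branch, tip in server_tips.items()
        if client_tips.get(branch) != tip
    )


# Hypothetical state: the client is behind on REL_14_STABLE only.
server_tips = {"master": "a4adc31f69", "REL_14_STABLE": "b33283cbd3"}
client_tips = {"master": "a4adc31f69", "REL_14_STABLE": "0000000000"}
print(branches_to_fetch(client_tips, server_tips))  # ['REL_14_STABLE']
```

In the common case the result is an empty list and the client can skip all of its git fetches.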
As Andres mentioned downthread, that's not a lot more lightweight than what "git fetch" does.
The thing we'd want to avoid is having to do that so often. And getting there is going to require modifying the buildfarm client to make it "smarter" regardless. In particular, making it do this "right" in the face of multiple branches is probably going to be a big win.
Then you could switch to long polling by letting the client say "if
currently none, I'm prepared to wait up to X seconds for a different
answer", assuming you know how to build the server side of that
(insert magic here). Of course, you can't make it too long or your
session might be dropped in the badlands between client and server,
but that's just a reason to make X configurable. I think RFC6202 says
that 120 seconds probably works fine across most kinds of links, which
means that you lower the total poll rate hitting the server, but--more
interestingly for me as a client--you minimise latency when something
finally happens. (With various keepalive tricks and/or heartbeat
streaming tricks you could possibly make it much higher, who knows...
but you'd have to set it very very low to do worse than what we're
doing today in total request count). Or maybe there is some existing
easy perl library that could be used for this (joke answer: cpan
install Twitter::API and follow @pg_commits).
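
The client side of such a long poll could look roughly like the sketch below. It is written in Python purely for illustration (the real buildfarm client is Perl), and `poll` is a hypothetical callable standing in for the HTTP request: it is assumed to return a list of changed branches, an empty list when the server gave up waiting, and to raise `OSError` when the connection was dropped along the way.

```python
import time


def wait_for_change(poll, wait_seconds=120, max_attempts=None,
                    backoff=5.0, sleep=time.sleep):
    """Keep long-polling until some branch changes, then return the list.

    wait_seconds is the configurable "X" passed to the server on each poll.
    max_attempts=None means poll forever; otherwise give up and return []
    after that many polls.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            changed = poll(wait_seconds)
        except OSError:
            # The session was dropped somewhere between client and server:
            # pause briefly, then poll again.
            sleep(backoff)
            continue
        if changed:
            return changed
    return []
```

With a 120 second wait this issues at most one request every two minutes while idle, yet returns as soon as the server has news.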
I also honestly wonder how big a problem a timeout much longer than 120 seconds would be in practice. Since we own both the client and the server in this case, we'd only be at the mercy of the network equipment in between, and I think we're much less exposed to weirdness there than "the average browser". Thus, as long as it's configurable, I think we could go for something much longer by default.
I'd imagine something like a request carrying these headers:
X-branch-master: a4adc31f69
X-branch-REL_14_STABLE: b33283cbd3
X-longpoll: 120
For that one it would check branches master and REL_14_STABLE, and if either branch tip doesn't match what was in the header, it'd return immediately with a text file that's basically
master:<whateveritis>
if master has changed and REL_14_STABLE has not.
If nothing has changed, go into longpoll for 120 seconds based on the header, and if nothing at all has changed in that time, return a 304.
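
A rough sketch of that server-side logic, under stated assumptions: `get_tips` is a hypothetical callable returning the current branch tips, hashes are compared as plain strings (so both sides must abbreviate them identically), and the wait is a simple re-check loop rather than anything event-driven. The `clock` and `sleep` parameters exist only so the loop can be exercised without really waiting.

```python
import time


def parse_request(headers):
    """Split X-branch-* / X-longpoll headers into (client_tips, wait_seconds)."""
    tips, wait = {}, 0
    prefix = "x-branch-"
    for name, value in headers.items():
        if name.lower().startswith(prefix):
            # Keep the branch name's original case: REL_14_STABLE, not rel_14_stable.
            tips[name[len(prefix):]] = value.strip()
        elif name.lower() == "x-longpoll":
            wait = int(value)
    return tips, wait


def handle(headers, get_tips, interval=1.0,
           clock=time.monotonic, sleep=time.sleep):
    """Return (status, body): 200 with 'branch:tip' lines, or 304 on timeout."""
    client_tips, wait = parse_request(headers)
    deadline = clock() + wait
    while True:
        current = get_tips()
        changed = [
            f"{branch}:{current[branch]}"
            for branch, tip in client_tips.items()
            if branch in current and current[branch] != tip
        ]
        if changed:
            return 200, "\n".join(changed) + "\n"
        if clock() >= deadline:
            return 304, ""
        sleep(interval)
```

One caveat with this header scheme: HTTP header names are case-insensitive and some intermediaries rewrite their case, while branch names are case-sensitive, so a real implementation might prefer to carry the branch names in the request body.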
We could also use something like a websocket to just stream the changes out over.
In either case we would also need to change the buildfarm client to run as a daemon rather than a cronjob, I think? (Obviously optional; we don't have to remove the current abilities.)
However, when I started this thread I was half expecting such a thing
to exist already, somewhere, I just haven't been able to find it
myself... Don't other people have this problem? Maybe everybody who
has this problem uses webhooks (git server post commit hook opens
connection to client) as you mentioned, but as you also mentioned
that'd never fly for our topology.
Yeah, webhook seems to be what most people use.
FWIW, an implementation for us would be a small daemon that receives such webhooks from our git server and redistributes them to the long-polling clients. That's still the easiest way to get the data out of git itself...
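
The core of such a daemon could be as small as the sketch below, assuming a threaded server in which one handler receives the webhook and the others serve long-polling clients. `TipHub` and its method names are hypothetical, and the HTTP plumbing on both sides is left out.

```python
import threading


class TipHub:
    """Holds the current branch tips and wakes up waiting long-pollers."""

    def __init__(self, tips=None):
        self._tips = dict(tips or {})
        self._cond = threading.Condition()

    def publish(self, branch, tip):
        """Called by the webhook handler when the git server reports a push."""
        with self._cond:
            self._tips[branch] = tip
            self._cond.notify_all()

    def wait_for_change(self, client_tips, timeout):
        """Block until a tip differs from client_tips or timeout seconds pass.

        Returns {branch: new_tip} for the changed branches, {} on timeout.
        """
        def changed():
            return {
                branch: tip
                for branch, tip in self._tips.items()
                if branch in client_tips and client_tips[branch] != tip
            }

        with self._cond:
            # wait_for re-evaluates the predicate, with the lock held,
            # each time publish() notifies.
            self._cond.wait_for(lambda: bool(changed()), timeout=timeout)
            return changed()
```

An empty result after the timeout corresponds to the 304 case; anything else is sent back to the client immediately.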
//Magnus