Re: Opinion poll: Sending an automated email to a thread when it gets added to the commitfest - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Opinion poll: Sending an automated email to a thread when it gets added to the commitfest
Msg-id CAEze2Wi8hk2FkXg=CA_ZArpFDVaTs5BBG0FdoxCd8R3BeTAiAg@mail.gmail.com
In response to Re: Opinion poll: Sending an automated email to a thread when it gets added to the commitfest  (Jelte Fennema-Nio <postgres@jeltef.nl>)
Responses Re: Opinion poll: Sending an automated email to a thread when it gets added to the commitfest
List pgsql-hackers
(sorry for the formatting, my mobile phone doesn't have the capabilities I usually get when using my laptop)

On Thu, 15 Aug 2024, 16:02 Jelte Fennema-Nio, <postgres@jeltef.nl> wrote:
On Thu, 15 Aug 2024 at 15:33, Peter Eisentraut <peter@eisentraut.org> wrote:
> Maybe this kind of thing should rather be on the linked-to web page, not
> in every email.

Yeah, I'll first put a code snippet on the page for the commitfest entry.

> But a more serious concern here is that the patches created by the cfbot
> are not canonical.  There are various heuristics when they get applied.
> I would prefer that people work with the actual patches sent by email,
> at least unless they know exactly what they are doing.  We don't want to
> create parallel worlds of patches that are like 90% similar but not
> really identical.

I'm not really sure what kind of heuristics and resulting differences
you're worried about here. The heuristics it uses are very simple and
are good enough for our CI. Basically they are:
1. Unzip/untar based on file extension
2. Apply patches using "patch" in alphabetic order
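
The two heuristics above can be sketched roughly as follows. This is only an illustration and not cfbot's actual code: the suffix lists and the helper names `classify` and `apply_order` are assumptions made up for this example.

```python
# Hypothetical sketch of the two heuristics described above; the suffix
# lists are assumptions, not the ones cfbot really uses.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2", ".tar", ".zip", ".gz")
PATCH_SUFFIXES = (".patch", ".diff")


def classify(filename):
    """Decide how to treat an attachment purely from its file extension."""
    lower = filename.lower()
    if lower.endswith(ARCHIVE_SUFFIXES):
        return "archive"  # would be unzipped/untarred first
    if lower.endswith(PATCH_SUFFIXES):
        return "patch"    # would be fed to "patch"
    return "other"        # ignored


def apply_order(filenames):
    """Return the patch files in the (alphabetic) order they would be applied."""
    return sorted(name for name in filenames if classify(name) == "patch")
```

With attachments named in the usual `git format-patch` style, alphabetic order matches the intended order of the series, e.g. `apply_order(["v2-0002-b.patch", "notes.txt", "v2-0001-a.patch"])` puts `v2-0001-a.patch` first.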

Also, when I apply patches myself, I use heuristics too. And my
heuristics are probably different from yours. So I'd expect that many
people using the exact same heuristic would only make the situation
better. Especially because if people don't know exactly what they are
doing, then their heuristics are probably not as good as the one of
our cfbot. I know I've struggled a lot the first few times when I was
manually applying patches.

One serious issue with this is that in cases of apply failures, CFBot delays, or other issues, the CFBot repo won't contain the latest version of the series' patchsets. E.g. a hacker can accidentally send an incremental patch, or an unrelated patch to fix an issue mentioned in the thread without splitting it off into a different thread, etc. This can easily cause users (and CFBot) to test and review the wrong patch, esp. when the reviewer does not look at the mail thread proper, which a CFA+GitHub-centric workflow would somewhat promote.

Apart from the above issue, I'm -0.5 on what to me equates to automated spam to -hackers: the volume of mails would make this around the 16th most common sender on -hackers, with about 400 mails/year (based on 80 new patches for the next CF, and 5 CFs/year, combined with Robert's 2023 statistics at [0]).

I also don't quite like the suggested contents of such a mail: (1) and (2) are essentially duplicative information, and because CF entries' IDs are not shown in the app, the "with ID 0000" part of (1) is practically useless (better to use the CFE's title). (3) would best be stored and/or integrated in the CFA, as would (4). Additionally, (4) isn't canonical/guaranteed to be up-to-date, see above. As for the "copy-pastable git commands" suggestion, I'm not sure that's applicable, for the same reasons that (4) won't work reliably. CFBot's repo to me seems more like an internal implementation detail of CFBot than an authoritative source of patchset diffs.


Maybe we could instead automate CF mail thread registration by allowing registration of threadless CF entries (as 'preliminary'), and detecting (and subsequently linking) new threads containing references to those CF entries, with e.g. a "CF: https://commitfest.postgresql.org/49/4980/" directive in the text of the new thread's initial mail. This would have the benefit of requiring no second mail for CF referencing purposes, be it automated or manual.
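
As a rough sketch of what detecting such a directive could look like (nothing like this exists in the CF app as far as I know; the pattern, the name `find_cf_entries`, and the reading of the URL as commitfest ID followed by entry ID are all assumptions for illustration):

```python
import re

# Hypothetical directive format: a line of its own reading
#   CF: https://commitfest.postgresql.org/<commitfest id>/<entry id>/
# The meaning of the two numbers is an assumption of this sketch.
CF_DIRECTIVE = re.compile(
    r"^CF:[ \t]*https://commitfest\.postgresql\.org/(\d+)/(\d+)/?[ \t]*$",
    re.MULTILINE,
)


def find_cf_entries(mail_body):
    """Return (commitfest_id, entry_id) pairs for every directive line found."""
    return [(int(cf), int(entry)) for cf, entry in CF_DIRECTIVE.findall(mail_body)]
```

Anchoring the match to a whole line means that a URL merely mentioned in the middle of a sentence, or inside a "> "-quoted reply, would not link the thread by accident.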
Alternatively, we could allow threads for new entries to be started through the CF app (which would automatically insert the right form data into the mail), providing an alternative avenue to registering patches that doesn't have the chicken-and-egg problem you're trying to address here.


Kind regards,

Matthias van de Meent
