Re: Experimental tool to explore commitfest patches - Mailing list pgsql-hackers

From Akshat Jaimini
Subject Re: Experimental tool to explore commitfest patches
Msg-id CAO8Bkb6-Rx+eqZyTLwOOQpos43CMJiem0E_NU1xYgc6Nid8aSw@mail.gmail.com
In response to Experimental tool to explore commitfest patches  (Jacob Brazeal <jacob.brazeal@gmail.com>)
Responses Re: Experimental tool to explore commitfest patches
List pgsql-hackers
Hi Jacob,

Thanks a lot for this! We have been trying to come up with a similar feature for the new commitfest app [0].

> If we like this, I'm happy to help port a version of it over to the commitfest app.

I would love to help you out in porting this to the commitfest app.

Regards,
Akshat Jaimini


On Mon, Feb 24, 2025 at 9:01 AM Jacob Brazeal <jacob.brazeal@gmail.com> wrote:
Hi all, 

I have created an experimental tool [0] to help explore the vast depths of the upcoming commitfest, and it's designed to help each contributor find actually useful and relevant patches to review. Please have a look!

Under the hood, it does two things:

1. Use a good LLM [1] to analyze all the mailing threads tied to the commitfest. This gives us a summary of the thread, a summary of the main blocker, if any, and a gut-check on whether we actually need a new reviewer in the thread. It also gives a first-principles read on the actual status: are we waiting for the author to make changes, for a reviewer to respond, etc. 
2. Cross-reference the files in the patches with the personal commit history of everyone in the PostgreSQL project. In this way, using a variant of the classic TF-IDF algorithm [2], we can score how close the patch lies to each contributor's usual territory. It's only a heuristic, but it seemed well worth trying.
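
To make the second step concrete, here is a minimal sketch of how such a score could be computed. It is not the tool's actual code: the exact TF-IDF variant is not described in this message, so the smoothing, the normalization, the function names (build_idf, score_patch, rank_contributors) and the sample contributors are all assumptions for illustration. Each contributor is treated as a "document" and each file path as a "term".

```python
import math
from collections import Counter


def build_idf(history):
    """Compute a smoothed IDF weight for every file path.

    history maps contributor name -> Counter of {file path: commits touching it}.
    Files touched by few contributors get a higher weight (assumed smoothing).
    """
    n = len(history)
    df = Counter()
    for files in history.values():
        df.update(files.keys())  # count contributors per file, not commits
    return {path: math.log((1 + n) / (1 + count)) + 1.0 for path, count in df.items()}


def score_patch(patch_files, files, idf):
    """Sum TF * IDF over the files in the patch for one contributor."""
    total = sum(files.values())
    if total == 0:
        return 0.0
    # TF is the share of the contributor's commits that touched the file.
    return sum((files.get(p, 0) / total) * idf.get(p, 0.0) for p in set(patch_files))


def rank_contributors(patch_files, history):
    """Return (contributor, score) pairs, highest score first."""
    idf = build_idf(history)
    scores = [(name, score_patch(patch_files, files, idf)) for name, files in history.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical commit histories, invented for this example.
    history = {
        "alice": Counter({"src/backend/access/heap/vacuumlazy.c": 8,
                          "src/backend/commands/vacuum.c": 2}),
        "bob": Counter({"src/bin/psql/describe.c": 5,
                        "src/backend/commands/vacuum.c": 5}),
    }
    patch_files = ["src/backend/access/heap/vacuumlazy.c",
                   "src/backend/commands/vacuum.c"]
    for name, score in rank_contributors(patch_files, history):
        print(f"{name}: {score:.3f}")
```

Scoring whole file paths keeps the sketch short; grouping by directory instead would be a reasonable alternative when a patch adds files that nobody has touched yet.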

The data pipeline for this is run on my personal laptop at the moment. I've just refreshed everything but, of course, the various statuses and analyses need to be re-run reasonably often to remain useful. It only costs a dollar or two to run everything through the LLM, and I can probably optimize what really needs to be processed, but this is worth considering if there is broader interest. If we like this, I'm happy to help port a version of it over to the commitfest app.

Here is the source code for the whole app [3]. 

Regards,
Jacob Brazeal

