Thread: Experimental tool to explore commitfest patches

Experimental tool to explore commitfest patches

From
Jacob Brazeal
Date:
Hi all, 

I have created an experimental tool [0] to help explore the vast depths of the upcoming commitfest, and it's designed to help each contributor find actually useful and relevant patches to review. Please have a look!

Under the hood, it does two things:

1. Use a good LLM [1] to analyze all the mailing threads tied to the commitfest. This gives us a summary of the thread, a summary of the main blocker, if any, and a gut-check on whether we actually need a new reviewer in the thread. It also gives a first-principles read on the actual status: are we waiting for the author to make changes, for a reviewer to respond, etc. 
2. Cross-reference the files in the patches to the personal commit history of everyone in the postgres project. In this way, using a variant of the classic TF-IDF algorithm [2], we can score how close the patch lies to each contributor's usual territory. It's only a heuristic, but seemed well worth trying.
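
To make the second step concrete, here is a minimal sketch of TF-IDF-style scoring over file paths. It is not the tool's actual code: the function names, the smoothed IDF formula, the cosine similarity, and the contributor names are all assumptions chosen for illustration, and the real variant may weight things differently.

```python
import math
from collections import Counter


def build_idf(histories):
    """histories maps contributor -> list of file paths they committed to (repeats allowed)."""
    n = len(histories)
    doc_freq = Counter()
    for paths in histories.values():
        doc_freq.update(set(paths))
    # Smoothed IDF (an assumption): files touched by almost everyone get little weight.
    return {path: math.log((1 + n) / (1 + count)) + 1.0
            for path, count in doc_freq.items()}


def tfidf_vector(paths, idf):
    """Weight each path by its relative frequency times its IDF; unknown paths get 0."""
    tf = Counter(paths)
    total = sum(tf.values())
    return {path: (count / total) * idf.get(path, 0.0) for path, count in tf.items()}


def cosine(a, b):
    dot = sum(weight * b.get(key, 0.0) for key, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_contributors(patch_files, histories):
    """Return (contributor, score) pairs, closest territory first."""
    idf = build_idf(histories)
    patch_vec = tfidf_vector(patch_files, idf)
    scores = {name: cosine(patch_vec, tfidf_vector(paths, idf))
              for name, paths in histories.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical contributors and commit histories, for illustration only.
histories = {
    "alice": ["contrib/postgres_fdw/deparse.c"] * 3
             + ["src/backend/optimizer/plan/planner.c"],
    "bob": ["src/backend/access/heap/heapam.c"] * 2
           + ["src/backend/optimizer/plan/planner.c"],
}
ranking = rank_contributors(["contrib/postgres_fdw/deparse.c"], histories)
```

In this toy data, the patch touches a file only "alice" has committed to, so she ranks first and "bob" scores zero. A heuristic like this leans heavily on how often a contributor has touched a given file, so its rankings are only a starting point.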

The data pipeline for this is run on my personal laptop at the moment. I've just refreshed everything but, of course, the various statuses and analyses need to be re-run reasonably often to remain useful. It only costs a dollar or two to run everything through the LLM, and I can probably optimize what really needs to be processed, but this is worth considering if there is broader interest. If we like this, I'm happy to help port a version of it over to the commitfest app.
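
One possible way to limit what needs re-processing is to fingerprint each thread and only call the LLM when the text has changed since the last run. The sketch below is an assumption about how that could look, not a description of the current pipeline; `run_pipeline`, the `analyze` callback, and the in-memory `cache` dict are hypothetical names.

```python
import hashlib
from typing import Callable, Dict


def fingerprint(text: str) -> str:
    """Stable hash of a thread's text, used to detect new replies."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def run_pipeline(threads: Dict[str, str],
                 analyze: Callable[[str], dict],
                 cache: Dict[str, dict]) -> Dict[str, dict]:
    """Analyze only threads whose text changed; reuse cached results otherwise.

    `analyze` stands in for the (paid) LLM call and is supplied by the caller.
    """
    results = {}
    for thread_id, text in threads.items():
        key = fingerprint(text)
        entry = cache.get(thread_id)
        if entry is None or entry["fingerprint"] != key:
            # New or updated thread: pay for a fresh analysis.
            entry = {"fingerprint": key, "analysis": analyze(text)}
            cache[thread_id] = entry
        results[thread_id] = entry["analysis"]
    return results
```

In practice the cache would need to live on disk or in a database between runs; a plain dict is used here only to keep the example self-contained.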

Here is the source code for the whole app [3]. 

Regards,
Jacob Brazeal


Re: Experimental tool to explore commitfest patches

From
Akshat Jaimini
Date:
Hi Jacob,

Thanks a lot for this! We have been trying to come up with a similar feature for the new commitfest app [0].

> If we like this, I'm happy to help port a version of it over to the commitfest app.

I would love to help you out in porting this to the commitfest app.

Regards,
Akshat Jaimini




Re: Experimental tool to explore commitfest patches

From
Jacob Brazeal
Date:
> We have been trying to come up with a similar feature for the new commitfest app
> I would love to help you out in porting this to the commitfest app.

Great! There is plenty of work to do, and there is some discussion in the Discord [0].

I think it would also be nice to set up a proper data pipeline using something like Dagster [1] or Airflow [2] to manage the scraping, LLM calls, postprocessing, etc.

Re: Experimental tool to explore commitfest patches

From
Robert Haas
Date:
On Sun, Feb 23, 2025 at 10:31 PM Jacob Brazeal <jacob.brazeal@gmail.com> wrote:
> I have created an experimental tool [0] to help explore the vast depths of the upcoming commitfest, and it's designed to help each contributor find actually useful and relevant patches to review. Please have a look!

As I also mentioned in the Discord, I really like the auto-generated
summaries. No doubt they are not perfect, but they seem pretty useful
on first look. The patch ranking seems a bit odd, though -- it thinks
I should be super-interested in postgres_fdw patches. So that part
might need some more work.

--
Robert Haas
EDB: http://www.enterprisedb.com