pg_running_stats - mergeable running statistics (Welford/Chan) extension for postgresql - Mailing list pgsql-hackers

From Chanukya SDS
Subject pg_running_stats - mergeable running statistics (Welford/Chan) extension for postgresql
Date
Msg-id CAB4f4B6ga-bjBAmWu2FEjXiR539M6zV2J4EOyOoO5EtPGaTFKQ@mail.gmail.com
List pgsql-hackers
Hi all,

I’d like to share a new PostgreSQL extension called pg_running_stats.

It implements mergeable, numerically stable running statistics using the Welford and Chan algorithms.
The built-in aggregates such as avg(), variance(), and stddev() have to scan the entire dataset again whenever new rows arrive. pg_running_stats instead maintains a compact internal state that can be updated with new values or merged with another state incrementally.
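
As a rough illustration of the idea, and not the extension's actual code, the state for mean and variance can be as small as a count, a running mean, and a sum of squared deviations. The sketch below assumes a hypothetical RunningStats struct and hypothetical rs_* function names. It shows a Welford-style update for one value and a Chan-style merge of two partial states.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical state layout for illustration; not the extension's real struct. */
typedef struct RunningStats
{
    long long n;    /* number of values seen */
    double    mean; /* running mean */
    double    m2;   /* sum of squared deviations from the mean */
} RunningStats;

static void
rs_init(RunningStats *s)
{
    s->n = 0;
    s->mean = 0.0;
    s->m2 = 0.0;
}

/* Welford update: fold one new value into the state. */
static void
rs_add(RunningStats *s, double x)
{
    double delta = x - s->mean;

    s->n += 1;
    s->mean += delta / (double) s->n;
    s->m2 += delta * (x - s->mean); /* uses the already-updated mean */
}

/* Chan merge: combine two partial states without revisiting the data. */
static RunningStats
rs_merge(const RunningStats *a, const RunningStats *b)
{
    RunningStats out;
    double       delta;

    if (a->n == 0)
        return *b;
    if (b->n == 0)
        return *a;

    delta = b->mean - a->mean;
    out.n = a->n + b->n;
    out.mean = a->mean + delta * (double) b->n / (double) out.n;
    out.m2 = a->m2 + b->m2
        + delta * delta * (double) a->n * (double) b->n / (double) out.n;
    return out;
}

/* Sample variance; undefined (NaN) for fewer than two values. */
static double
rs_variance_samp(const RunningStats *s)
{
    return s->n > 1 ? s->m2 / (double) (s->n - 1) : NAN;
}
```

Merging the states built from two disjoint chunks of the input gives the same count, mean, and m2 (up to floating-point rounding) as feeding every value through rs_add on a single state.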

This makes it well-suited for:
  1. streaming or real-time analytics where data arrives continuously,
  2. incremental computation over large tables,
  3. parallel or distributed queries that need to merge partial aggregates efficiently.

The extension computes mean, variance, standard deviation, skewness, kurtosis, and min/max, all in a single pass.

It’s written entirely in C, depends only on PostgreSQL headers, and builds cleanly on macOS (Homebrew) and Linux using PGXS.

Source and documentation:
https://github.com/chanukyasds/pg_running_stats

Any feedback, testing, or suggestions for improvement would be very welcome.

Thanks,
Chanukya
