Re: Optimize SnapBuildPurgeOlderTxn: use in-place compaction instead of temporary array - Mailing list pgsql-hackers

From: Xuneng Zhou
Subject: Re: Optimize SnapBuildPurgeOlderTxn: use in-place compaction instead of temporary array
Msg-id: CABPTF7UxJ01R9NGFMX_8QBbtrEOJ_rT1SSF+P_DgYJsp9Gc5jw@mail.gmail.com
In response to: Re: Optimize SnapBuildPurgeOlderTxn: use in-place compaction instead of temporary array (Kirill Reshke <reshkekirill@gmail.com>)
List: pgsql-hackers
Hi,

On Mon, Oct 20, 2025 at 11:36 AM Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> On Mon, 20 Oct 2025 at 08:08, Xuneng Zhou <xunengzhou@gmail.com> wrote:
> >
> > Hi, thanks for looking into this.
> >
> > On Sat, Oct 18, 2025 at 4:59 PM Kirill Reshke <reshkekirill@gmail.com> wrote:
> > >
> > > On Sat, 18 Oct 2025 at 12:50, Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > > >
> > > > Hi Hackers,
> > >
> > > Hi!
> > >
> > > > The SnapBuildPurgeOlderTxn function previously used a suboptimal
> > > > method to remove old XIDs from the committed.xip array. It allocated a
> > > > temporary workspace array, copied the surviving elements into it, and
> > > > then copied them back, incurring unnecessary memory allocation and
> > > > multiple data copies.
> > > >
> > > > This patch refactors the logic to use a standard two-pointer, in-place
> > > > compaction algorithm. The new approach filters the array in a single
> > > > pass with no extra memory allocation, improving both CPU and memory
> > > > efficiency.
> > > >
> > > > No behavioral changes are expected. This resolves a TODO comment
> > > > expecting a more efficient algorithm.
> > > >
> > >
> > > Indeed, these changes look correct.
> > > I wonder why b89e151054a0 wrote this code the way it did; I hope we
> > > are not missing anything here.
> >
> > I think this small refactor does not introduce behavioral changes or
> > breakage, given the existing constraints.
> >
> > > Can we construct a microbenchmark here which will show some benefit?
> > >
> >
> > I prepared a simple microbenchmark to evaluate the impact of the
> > algorithm replacement. The attached results summarize the findings.
> > An end-to-end benchmark was not included, as this function is unlikely
> > to be a performance hotspot in typical decoding workloads—the array
> > being cleaned is expected to be relatively small under normal
> > operating conditions. However, its impact could become more noticeable
> > in scenarios with long-running transactions and a large number of
> > catalog-modifying DML or DDL operations.
> >
> > Hardware:
> > AMD EPYC™ Genoa 9454P 48-core 4th generation
> > DDR5 ECC reg
> > NVMe SSD Datacenter Edition (Gen 4)
> >
> > Best,
> > Xuneng
>
> At first glance these results look satisfactory.
>
> Can you please describe how you got your numbers? Maybe share the
> scripts or steps to reproduce, in case anyone is willing to...
>

Sure. Here is a brief description of this experimental benchmark:

1) What Tier 1 measures

Function under test: committed.xip purge in SnapBuild
(OLD=workspace+memcpy vs NEW=in-place compaction).
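
In rough outline, the two variants look like this (simplified names:
xip, xcnt, and xmin stand in for builder->committed.xip, .xcnt, and
builder->xmin; this is a sketch, not the committed code verbatim):

/* OLD: allocate a workspace, copy survivors out, then memcpy back. */
TransactionId *workspace = palloc(xcnt * sizeof(TransactionId));
uint32 surviving = 0;

for (uint32 off = 0; off < xcnt; off++)
{
    if (NormalTransactionIdPrecedes(xip[off], xmin))
        continue;               /* older than xmin: purge */
    workspace[surviving++] = xip[off];
}
memcpy(xip, workspace, surviving * sizeof(TransactionId));
pfree(workspace);
xcnt = surviving;

/* NEW: two-pointer in-place compaction; one pass, no allocation. */
uint32 write_off = 0;

for (uint32 read_off = 0; read_off < xcnt; read_off++)
{
    if (!NormalTransactionIdPrecedes(xip[read_off], xmin))
        xip[write_off++] = xip[read_off];
}
xcnt = write_off;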

Inputs:
Array sizes: 100, 500, 1000, 2000, 5000, 10000
Keep ratios: 0.9, 0.5, 0.1, 0.01
Distributions: scattered (Fisher–Yates shuffle), contiguous
Repetitions: 30 per scenario

RNG and determinism: pg_prng_state with seed 42 per dataset ensures
reproducibility.
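
For illustration, the scattered dataset can be generated along these
lines (fill_scattered is a hypothetical helper; pg_prng_seed and
pg_prng_uint64_range come from src/include/common/pg_prng.h):

#include "postgres.h"
#include "access/transam.h"     /* FirstNormalTransactionId */
#include "common/pg_prng.h"

/* Fill the array with ascending XIDs, then Fisher-Yates shuffle it. */
static void
fill_scattered(TransactionId *xip, uint32 n)
{
    pg_prng_state rng;
    uint32      i;

    pg_prng_seed(&rng, 42);     /* fixed seed => reproducible runs */
    for (i = 0; i < n; i++)
        xip[i] = FirstNormalTransactionId + i;
    for (i = n - 1; i > 0; i--)
    {
        uint32      j = (uint32) pg_prng_uint64_range(&rng, 0, i);
        TransactionId tmp = xip[i];

        xip[i] = xip[j];
        xip[j] = tmp;
    }
}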

Metrics recorded per scenario:
Time (ns): mean, median, p95
Survivors (count)
Memory traffic (bytes): bytes_read, bytes_written, bytes_total
OLD: reads = (xcnt + survivors) × sizeof(XID);
     writes = 2 × survivors × sizeof(XID)
NEW: reads = xcnt × sizeof(XID);
     writes = survivors × sizeof(XID)
(a worked example follows below)
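
To make the formulas concrete: at xcnt = 10000 with keep ratio 0.1
(survivors = 1000) and 4-byte XIDs, OLD reads (10000 + 1000) × 4 =
44000 bytes and writes 2 × 1000 × 4 = 8000 bytes (52000 bytes total),
while NEW reads 10000 × 4 = 40000 bytes and writes 1000 × 4 = 4000
bytes (44000 bytes total).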

2) The core components

#  C Extension (snapbuild_bench.c) - The actual benchmark implementation

The C extension contains the actual benchmark implementation that runs
inside the PostgreSQL backend process. It's designed to:
- Mimic real PostgreSQL code paths as closely as possible
- Use actual PostgreSQL data structures (`TransactionId`, `MemoryContext`)
- Call real PostgreSQL functions (`NormalTransactionIdPrecedes`)
- Measure with nanosecond precision using PostgreSQL's timing
  infrastructure (sketched below)
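
The timing side looks roughly like this (purge_inplace and the
surrounding variables are assumed names; the INSTR_TIME_* macros are
PostgreSQL's own, from src/include/portability/instr_time.h):

#include "portability/instr_time.h"

instr_time  start, duration;

INSTR_TIME_SET_CURRENT(start);
purge_inplace(xip, &xcnt, xmin);        /* hypothetical function under test */
INSTR_TIME_SET_CURRENT(duration);
INSTR_TIME_SUBTRACT(duration, start);   /* duration = end - start */
elog(NOTICE, "purge took " UINT64_FORMAT " ns",
     INSTR_TIME_GET_NANOSEC(duration));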

#  SQL Wrapper (snapbuild_bench--1.0.sql) - Function definitions

#  Orchestration Scripts (run_snapbuild_purge_bench.sh) - Automated
benchmark execution and analysis

3) Execution Flow

1. Extension Installation
# Build and install
export PG_CONFIG=$HOME/pg/vanilla/bin/pg_config
make -C contrib_extension/snapbuild_bench clean install
# Create extension in database
CREATE EXTENSION snapbuild_bench;

2. Run the full benchmark suite
./run_snapbuild_purge_bench.sh --clean --with-baseline <patch>

3. Data Analysis
# Generate plots
python3 plot_tier1_results.py --csv results/unit/base_unit.csv --out plots/
# Compare baseline vs patched
python3 compare_snapbuild_results.py vanilla/ patched/

TBH, the performance improvement from this refactor is fairly
straightforward, and it’s unlikely to introduce regressions. The
experimental benchmark is therefore more complex than necessary.
Still, I treated it as a learning exercise — an opportunity to
practice benchmarking methodology and hopefully to reuse some of these
techniques when evaluating more performance-critical paths in the
future. If anyone has suggestions or spots issues, I’d greatly
appreciate your feedback as well.

Best,
Xuneng

