Re: Flush some statistics within running transactions - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: Flush some statistics within running transactions
Date
Msg-id aWojE18aF9qj0EPI@ip-10-97-1-34.eu-west-3.compute.internal
In response to Re: Flush some statistics within running transactions  (Sami Imseih <samimseih@gmail.com>)
Responses Re: Flush some statistics within running transactions
List pgsql-hackers
Hi,

On Thu, Jan 15, 2026 at 11:25:18AM -0600, Sami Imseih wrote:
> > > > The 1 second flush interval is currently hardcoded but we could imagine
> > > > increasing it or making it configurable.
> > >
> > > Someone may want to turn this off as well. I think a GUC will be needed.
> >
> > I gave this more thought and I wonder if this should be configurable at all.
> > I mean, we don't do it for PGSTAT_MIN_INTERVAL, PGSTAT_MAX_INTERVAL and
> > PGSTAT_IDLE_INTERVAL. We could imagine making it configurable if it produced a
> > noticeable performance impact, but that's not what I observed.
> 
> Is there a reason we need a new constant (PGSTAT_ANYTIME_FLUSH_INTERVAL)
> for anytime flushes and can't rely on the existing PGSTAT_MIN_INTERVAL?

It currently gives flexibility for testing. If we agree that 1s is the right value
and that it should not be configurable, then yeah, we could replace it with
PGSTAT_MIN_INTERVAL.

> Also, how did you benchmark? I am less concerned about long-running
> transactions and background processes, and more about short, high-concurrency
> transactions seeing additional overhead due to additional flushing. Is the
> latter a concern?

I ran 3 kinds of tests:

1/
pgbench -c 32 -j 4 -T 60 -f short.sql -n -r $DB

with short.sql:

\set t1 random(1, 100)
\set t2 random(1, 100)
\set t3 random(1, 100)
\set t4 random(1, 100)
\set t5 random(1, 100)
\set t6 random(1, 100)
\set t7 random(1, 100)
\set t8 random(1, 100)
\set t9 random(1, 100)
\set t10 random(1, 100)
\set row random(1, 1000)

BEGIN;
UPDATE t:t1 SET val = val + 1 WHERE id = :row;
UPDATE t:t2 SET val = val + 1 WHERE id = :row;
UPDATE t:t3 SET val = val + 1 WHERE id = :row;
UPDATE t:t4 SET val = val + 1 WHERE id = :row;
UPDATE t:t5 SET val = val + 1 WHERE id = :row;
UPDATE t:t6 SET val = val + 1 WHERE id = :row;
UPDATE t:t7 SET val = val + 1 WHERE id = :row;
UPDATE t:t8 SET val = val + 1 WHERE id = :row;
UPDATE t:t9 SET val = val + 1 WHERE id = :row;
UPDATE t:t10 SET val = val + 1 WHERE id = :row;
COMMIT;

2/
psql $DB -f long.sql

with long.sql:

DO $$
BEGIN
  FOR i IN 1..100 LOOP
    EXECUTE format('TRUNCATE TABLE t%s', i);
    EXECUTE format('INSERT INTO t%s SELECT generate_series(1, 1000000)', i);
    EXECUTE format('UPDATE t%s SET val = val + 1', i);
    EXECUTE format('SELECT COUNT(1) FROM t%s', i);
  END LOOP;
END $$;

3/
pgbench -i -s 50 $DB
pgbench -c 32 -j 4 -T 60 -N -n -r $DB
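
For anyone wanting to reproduce 1/ and 2/: the definitions of tables t1..t100 are
not shown above, so the following is only a sketch under an assumed schema (an
integer "id" key plus an integer "val" defaulting to 0, with ids 1..1000 preloaded
for short.sql). The tables actually used may differ.

```sql
-- Assumed setup, not taken from the original tests: create t1..t100 and
-- preload the 1000 rows that short.sql addresses via "WHERE id = :row".
DO $$
BEGIN
  FOR i IN 1..100 LOOP
    -- "val" defaults to 0 so that "val = val + 1" never operates on NULL,
    -- including after long.sql inserts into the first column only.
    EXECUTE format('CREATE TABLE IF NOT EXISTS t%s (id int PRIMARY KEY, val int NOT NULL DEFAULT 0)', i);
    -- Safe to re-run: rows that already exist are skipped.
    EXECUTE format('INSERT INTO t%s (id) SELECT generate_series(1, 1000) ON CONFLICT (id) DO NOTHING', i);
  END LOOP;
END $$;
```

Note that long.sql truncates and refills every table, so this would need to be
re-run before repeating 1/ if you want the original 1000 rows per table.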

I don't think this feature adds a noticeable performance impact, which is why the
tests are that simple. Do you think we should worry more?

> > > I’m concerned that fields being temporarily out of sync might impact monitoring
> > > calculations, if the formula is dealing with fields that have
> > > different flush strategies.
> >
> > That's a good point. Maybe we should document the fields flush strategy?
> 
> Yeah, we will need to document this.

Will do in the next version.

> I checked by running seq scans in a long-running transaction,
> and I observed both of these values being updated at the same time. I think
> this is OK.
> 

I think the same.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


