Thread: pending_since assertion failure on skink

pending_since assertion failure on skink

From
Andres Freund
Date:
Hi,


https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=skink&dt=2025-04-11%2007%3A41%3A36&stg=subscription-check
shows a failure on the publisher:

  2025-04-11 11:01:06.175 UTC [2975477][walsender][29/0:0] STATEMENT:  START_REPLICATION SLOT
"pg_16410_sync_16392_7491957737045826723"LOGICAL 0/17DEB30 (proto_version '4', streaming 'parallel', origin 'any',
publication_names'"tap_pub_x","tap_pub_allinschema"')
 
  TRAP: failed Assert("pending_since == 0"), File: "../pgsql/src/backend/utils/activity/pgstat.c", Line: 734, PID:
2975477
  postgres: publisher: walsender bf postgres [local] START_REPLICATION(ExceptionalCondition+0x5f) [0x6fb855]
  postgres: publisher: walsender bf postgres [local] START_REPLICATION(pgstat_report_stat+0x119) [0x5cad06]
  postgres: publisher: walsender bf postgres [local] START_REPLICATION(PostgresMain+0x80f) [0x5b1e84]
  postgres: publisher: walsender bf postgres [local] START_REPLICATION(BackendMain+0x59) [0x5aacfe]
  postgres: publisher: walsender bf postgres [local] START_REPLICATION(postmaster_child_launch+0x117) [0x4f8e27]
  postgres: publisher: walsender bf postgres [local] START_REPLICATION(+0x3f3eec) [0x4fbeec]
  postgres: publisher: walsender bf postgres [local] START_REPLICATION(+0x3f5c6b) [0x4fdc6b]
  postgres: publisher: walsender bf postgres [local] START_REPLICATION(PostmasterMain+0x1190) [0x4ff26c]
  postgres: publisher: walsender bf postgres [local] START_REPLICATION(main+0x1e2) [0x42afe2]
  /lib/x86_64-linux-gnu/libc.so.6(+0x29ca8) [0x5b3bca8]
  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x5b3bd65]
  postgres: publisher: walsender bf postgres [local] START_REPLICATION(_start+0x21) [0x1ee761]
  2025-04-11 11:01:06.574 UTC [2971921][postmaster][:0] LOG:  client backend (PID 2975477) was terminated by signal 6:
Aborted
  2025-04-11 11:01:06.574 UTC [2971921][postmaster][:0] DETAIL:  Failed process was running: START_REPLICATION SLOT
"pg_16410_sync_16392_7491957737045826723"LOGICAL 0/17DEB30 (proto_version '4', streaming 'parallel', origin 'any',
publication_names'"tap_pub_x","tap_pub_allinschema"')
 
  2025-04-11 11:01:06.577 UTC [2971921][postmaster][:0] LOG:  terminating any other active server processes

My understanding of what this indicates is that we we set pending_stats =
true, but then somehow discovered that we don't actually have pending stats.

I suspect that this is related to

commit 039549d70f6aa2daa3714a13752a08fa8ca2fb05
Author: Michael Paquier <michael@paquier.xyz>
Date:   2025-04-08 07:57:19 +0900

    Flush the IO statistics of active WAL senders more frequently

Which afaict could lead to having pending stats as known to
pgstat_report_stat() which then could be flushed "behind
pgstat_report_stat()'s back" due to the new pgstat_flush_io() and
pgstat_flush_backend() calls.

We might just have to give up on that assertion, I guess?

Greetings,

Andres Freund



Re: pending_since assertion failure on skink

From
Bertrand Drouvot
Date:
Hi,

On Sat, Apr 12, 2025 at 10:33:33AM -0400, Andres Freund wrote:
> I suspect that this is related to
> 
> commit 039549d70f6aa2daa3714a13752a08fa8ca2fb05

+1

> We might just have to give up on that assertion, I guess?

Probably. I'll have a look at it next week.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com