Re: replication primary writting infinite number of WAL files - Mailing list pgsql-general

From Laurenz Albe
Subject Re: replication primary writting infinite number of WAL files
Msg-id 6097c54582a91b40eaab2d3d24152d98fcaf998f.camel@cybertec.at
In response to Re: replication primary writting infinite number of WAL files  (Les <nagylzs@gmail.com>)
Responses Re: replication primary writting infinite number of WAL files
List pgsql-general
On Fri, 2023-11-24 at 16:59 +0100, Les wrote:
>
>
> On Fri, 2023-11-24 at 16:00, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > On Fri, 2023-11-24 at 12:39 +0100, Les wrote:
> > > Under normal circumstances, the number of write operations is relatively low, with an
> > > average of 4-5 MB/sec total write speed on the disk associated with the data directory.
> > > Yesterday, the primary server suddenly started writing to the pg_wal directory at a
> > > crazy pace, 1.5GB/sec, but sometimes it went up to over 3GB/sec.
> > > [...]
> > > Upon further analysis of the database, we found that we did not see any mass data
> > > changes in any of the tables. The only exception is a sequence value that was moved
> > > millions of steps within a single minute.
> >
> > That looks like some application went crazy and inserted millions of rows, but the
> > inserts were rolled back.  But it is hard to be certain with the clues given.
>
> Writing of WAL files continued after we shut down all clients, and restarted the primary PostgreSQL server.
>
> How can the primary server generate more and more WAL files (writes) after all clients have
> been shut down and the server was restarted? My only bet was the autovacuum. But I ruled
> that out, because removing a replication slot has no effect on the autovacuum (am I wrong?).

It must have been autovacuum.  Removing a replication slot has an influence, since then
autovacuum can do more work.  If the problem stopped when you dropped the replication slot,
it could be a coincidence.

> Now you are saying that this looks like a huge rollback.

It could have been many small rollbacks.

> Does rolling back changes require even more data to be written to the WAL after server
> restart?

No.  My assumption would be that something generated lots of INSERTs that were all
rolled back.  That creates WAL, even though you see no change in the table data.
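
One way to check this on a test system would be to note the value of
pg_current_wal_lsn() before and after a batch of INSERTs that ends in ROLLBACK,
and compare the two positions.  The following is only a minimal sketch of that
arithmetic (the same idea as pg_wal_lsn_diff()); the helper names and the two
LSN values are hypothetical and not taken from the system discussed here.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_bytes_between(start_lsn: str, end_lsn: str) -> int:
    """Number of WAL bytes written between two LSNs."""
    return parse_lsn(end_lsn) - parse_lsn(start_lsn)


if __name__ == "__main__":
    # Hypothetical values, as SELECT pg_current_wal_lsn() might return them
    # before and after a batch of INSERTs that was rolled back.
    before = "0/3000000"
    after = "0/7000000"
    mib = wal_bytes_between(before, after) / (1024 * 1024)
    print(f"{mib:.0f} MiB of WAL written")
```

If the difference is large while the table contents are unchanged, that would
be consistent with rolled-back work having produced the WAL.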


> Does removing a replication slot lessen the amount of data needed to be written for
> a rollback (or for anything else)?

No: the WAL is generated by whatever precedes the ROLLBACK, and the ROLLBACK does
not create a lot of WAL.

> It is a fact that the primary stopped writing at 1.5GB/sec the moment we removed the slot.

I have no explanation for that, except a coincidence.
Replication slots don't generate WAL.

Yours,
Laurenz Albe


