Re: replication primary writting infinite number of WAL files - Mailing list pgsql-general

From Adrian Klaver
Subject Re: replication primary writting infinite number of WAL files
Date
Msg-id 592a4e9d-0d12-40ac-ad2c-0ddc4caa760a@aklaver.com
Whole thread Raw
In response to Re: replication primary writting infinite number of WAL files  (Les <nagylzs@gmail.com>)
Responses Re: replication primary writting infinite number of WAL files
List pgsql-general


On 11/24/23 1:05 PM, Les wrote:

>> Sequence is incremented by 100,  so for example, between 2023-11-24
>> 10:20:00 and 2023-11-24 10:21:00 it went up 62188671 steps. I think it
>> is not possible to insert 62188671 rows into a table. A psql function
>> might be able to increment a sequence 62M times / minute, I'm not sure.

Am I correct in assuming id has as it's default nextval(<the_sequence>)?
All primary keys are "id int8 NOT NULL". They don't have a default, but all of these identifiers are generated from the same (global) sequence. We are using a single global sequence, because there are only a few writes. 99% of I/O operations are read. 


A single sequence for all id columns across all tables?

How is the sequence value landing in the id column?


If so it would seem to me something was doing a lot of INSERTS between
2023-11-24 10:20:00.000 and 2023-11-24 10:21:00.000.

Yes. But is it possible to insert 62M rows within 1 minute?

Is has not been determined yet that 62 million inserts where done, just that the sequence jumped by that number of steps.



It might be possible to "select nextval" 1M times per second from an Pl/SQL function that does nothing else in an infinite loop. But that would not write too much data on the disk.

On an old laptop:

create table insert_test(id int);

insert into insert_test select val from generate_series(1, 1000000) as t(val);
INSERT 0 1000000
Time: 943.918 ms

I would  say there is more then just the id being inserted, unless all the other fields allow NULL.




 

And there is nothing in the logs in that time period besides "checkpoints are happening too frequently"?



The "set transaction" warning is due to a bug in an application that calls SET TRANSACTION before starting a transaction.

And what is that app doing when it does SET TRANSACTION?




https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT

log_statement

set to at least 'mod'?

Unfortunately we don't, but now that you asked, we will turn this on. 


>> We are using dev databases that are created from snapshots of the
>> standby. There is a possibility that a dev database instance (created
>> from a snapshot of the standby) might have connected the primary just
>> before it was reconfigured to be standalone. Can this be a problem?

Was your original report for the dev databases also?
No, every log and metric was for the prod db.


So how do the dev databases enter into this"


How are the snapshots being taken?
zfs snapshot is taken on the standby, then zfs clone is created on the snapshot, and a new postgresql instance is started on top of the clone. It recovers within one minute. In a very few cases (two or three times in a year), it fails to recover. Then we use a different snapshot. When the dev instance starts up, then we immediately delete everything from repmgr nodes, and disable repmgr completely. Today we noticed that the dev db was created in a network that made it possible to connect to the prod primary. (dev db's network was not separated from the prod db's network, fixed today). The dev db might have connected to the prod/primary after startup. But that dev instance was created 7 days ago, so probably it is not related.

What I know about ZFS would fit in the navel of flea, so someone else will have to comment on this.


Dev connected to prod/primary how?



Actually, we need this standby because we are using zfs snapshots for making frequent database backups. Without this standby, we can't make frequent backups. (The primary and the other standby are not using zfs.)

Thanks,

    Laszlo


pgsql-general by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: pg_getnameinfo_all() failed: Temporary failure in name resolution
Next
From: Ron Johnson
Date:
Subject: Re: pg_getnameinfo_all() failed: Temporary failure in name resolution