Re: replication primary writting infinite number of WAL files - Mailing list pgsql-general

From Les
Subject Re: replication primary writting infinite number of WAL files
Date
Msg-id CAKXe9UBxKm3pTTUVmoOHMM96kGzN=Y3rHoP53dD=FytMzM8bPQ@mail.gmail.com
In response to Re: replication primary writting infinite number of WAL files  (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses Re: replication primary writting infinite number of WAL files
List pgsql-general


A single sequence for all id columns across all tables?

How is the sequence value landing in the id column?

In most cases it is by using "nextval(seq_name)" in the SQL statement. But sometimes the sequence value is fetched first, and then multiple inserts are sent with fixed, increasing values. (Sometimes records reference each other; such records are inserted within a short transaction with deferred foreign key constraints.)
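Roughly, the two patterns look like this (a sketch with made-up table, column and sequence names; the foreign keys are assumed to be declared DEFERRABLE):

-- Pattern 1: the value is assigned inside the INSERT itself
INSERT INTO customer (id, name)
VALUES (nextval('global_id_seq'), 'Example Ltd');

-- Pattern 2: values are fetched up front (say, 100 and 101), then used
-- as fixed ids for rows that reference each other; the FK checks are
-- deferred until COMMIT
BEGIN;
SET CONSTRAINTS ALL DEFERRED;
INSERT INTO node (id, peer_id) VALUES (100, 101);
INSERT INTO node (id, peer_id) VALUES (101, 100);
COMMIT;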


It might be possible to "select nextval" 1M times per second from a PL/pgSQL function that does nothing else in an infinite loop. But that would not write much data to disk.
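For example, a loop like this (a minimal sketch; it assumes a sequence named test_seq already exists) burns through a million values while producing very little WAL:

DO $$
DECLARE
    v bigint;
BEGIN
    FOR i IN 1..1000000 LOOP
        v := nextval('test_seq');  -- consumes a value, writes almost nothing
    END LOOP;
END;
$$;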

On an old laptop:

create table insert_test(id int);

insert into insert_test select val from generate_series(1, 1000000) as t(val);
INSERT 0 1000000
Time: 943.918 ms

I would say there is more than just the id being inserted, unless all the other fields allow NULL.


And there is nothing in the logs in that time period besides "checkpoints are happening too frequently"?
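For reference, that LOG message is governed by the checkpoint_warning setting: it fires when WAL-triggered checkpoints come closer together than that interval, and its hint suggests raising max_wal_size. The current values can be checked with:

SHOW checkpoint_warning;
SHOW max_wal_size;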



The "set transaction" warning is due to a bug in an application that calls SET TRANSACTION before starting a transaction.
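For context, the misuse is easy to reproduce in psql (the isolation level below is only an example); outside a transaction block the server warns and the setting has no lasting effect:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
WARNING:  SET TRANSACTION can only be used in transaction blocks
SET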

And what is that app doing when it does SET TRANSACTION?

It is the main app that uses the database, via JDBC. Multiple instances were running when the sequence jumped.




How are the snapshots being taken?
A zfs snapshot is taken on the standby, then a zfs clone is created from the snapshot, and a new PostgreSQL instance is started on top of the clone. It recovers within a minute. In a very few cases (two or three times a year) it fails to recover; then we use a different snapshot.

When the dev instance starts up, we immediately delete everything from the repmgr nodes table and disable repmgr completely. Today we noticed that the dev db was created in a network from which it was possible to connect to the prod primary (the dev db's network was not separated from the prod db's network; fixed today). The dev db might have connected to the prod primary after startup, but that dev instance was created 7 days ago, so it is probably not related.
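For reference, the repmgr cleanup on the freshly started clone amounts to something like this (a sketch; it assumes repmgr's default schema name, and disabling the repmgrd daemon itself is done at the service level, outside SQL):

DELETE FROM repmgr.nodes;  -- forget the prod cluster topology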

What I know about ZFS would fit in the navel of a flea, so someone else will have to comment on this.


Dev connected to prod/primary how?

When the dev db starts up (from the cloned data directory), it still has the old repmgr config. That config is deleted, and repmgr is disabled right after startup, but there is the possibility that the dev db connected to the primary when it was cloned (7 days ago), because at the very beginning of startup it is an exact clone of the standby from an earlier point in time.
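If in doubt, the primary's walsender view shows who is currently attached; a cloned standby that connected would show up there alongside the legitimate one (a standard catalog view, shown here only as a way to check):

SELECT client_addr, application_name, state, sync_state
FROM pg_stat_replication;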

   Laszlo
