Big UPDATE breaking replication - Mailing list pgsql-admin

From Kouber Saparev
Subject Big UPDATE breaking replication
Date
Msg-id 51ADD54F.3030702@saparev.com
Whole thread Raw
Responses Re: Big UPDATE breaking replication  (Albe Laurenz <laurenz.albe@wien.gv.at>)
Re: Big UPDATE breaking replication  (Steve Crawford <scrawford@pinpointresearch.com>)
List pgsql-admin
Hello,

We are using the 9.1 built-in streaming replication.

Recently our slave nodes fell behind because of an UPDATE statement. It
took about 3 minutes to execute, but it affected half a million records,
hence the replication broke with the "requested WAL segment ... has
already been removed" series of error messages.

The WAL settings we have are:

max_wal_senders = 6
wal_keep_segments = 60
max_standby_archive_delay = 300s


I guess increasing the wal_keep_segments value would prevent it from
happening in the future, but increase it with how much? What value would
be high enough?

Also we noticed some strange error message appearing shortly before and
after this same statement: "LOG:  out of file descriptors: Too many open
files; release and retry".

Could it be related somehow and what does it mean exactly?

Here's an excerpt from the master DB log:


May 30 12:23:09 DB1 postgres[28201]: [13-1] user=www,db=xxx LOG:  out of
file descriptors: Too many open files; release and retry
May 30 12:23:09 DB1 postgres[28201]: [13-2] user=www,db=xxx CONTEXT:
writing block 0 of relation base/2819385/2820788
May 30 12:23:09 DB1 postgres[28201]: [13-3] user=www,db=xxx STATEMENT:
UPDATE
May 30 12:23:09 DB1 postgres[28201]: [13-4] ^I              message
May 30 12:23:09 DB1 postgres[28201]: [13-5] ^I            SET
May 30 12:23:09 DB1 postgres[28201]: [13-6] ^I
sender_has_deleted=TRUE,
May 30 12:23:09 DB1 postgres[28201]: [13-7] ^I
receiver_has_deleted=TRUE
May 30 12:23:09 DB1 postgres[28201]: [13-8] ^I            WHERE
from_profile_sid=870

...

May 30 12:39:47 DB1 postgres[9053]: [2-1] user=postgres,db=[unknown]
FATAL:  requested WAL segment 00000001000002DE000000BD has already been
removed


Regards,
--
Kouber Saparev


pgsql-admin by date:

Previous
From: prakhar jauhari
Date:
Subject: Re: Steps to switch from Master to standby mode :
Next
From: Albe Laurenz
Date:
Subject: Re: Big UPDATE breaking replication