Re: Streaming replication and a disk full in primary - Mailing list pgsql-hackers

From	Fujii Masao
Subject	Re: Streaming replication and a disk full in primary
Date	April 12, 2010 12:39:50
Msg-id	g2m3f0b79eb1004120539s59fd98a7q2c3176ae797f2fe1@mail.gmail.com
In response to	Re: Streaming replication and a disk full in primary (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List	pgsql-hackers

On Mon, Apr 12, 2010 at 7:41 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
>> We should remove the document "25.2.5.2. Monitoring"?
>
> I updated it to no longer claim that the primary can run out of disk
> space because of a hung WAL sender. The information about calculating
> the lag between primary and standby still seems valuable, so I didn't
> remove the whole section.

Yes.

> !      An important health indicator of streaming replication is the amount
> !      of WAL records generated in the primary, but not yet applied in the
> !      standby.

Since pg_last_xlog_receive_location doesn't tell us the WAL location
not yet applied, we should use pg_last_xlog_replay_location instead.
How about?:

----------------
      An important health indicator of streaming replication is the amount
      of WAL records generated in the primary, but not yet applied in the
      standby. You can calculate this lag by comparing the current WAL write
-     location on the primary with the last WAL location received by the
+     location on the primary with the last WAL location replayed by the
      standby. They can be retrieved using
      <function>pg_current_xlog_location</> on the primary and the
-     <function>pg_last_xlog_receive_location</> on the standby,
+     <function>pg_last_xlog_replay_location</> on the standby,
      respectively (see <xref linkend="functions-admin-backup-table"> and
      <xref linkend="functions-recovery-info-table"> for details).
-     The last WAL receive location in the standby is also displayed in the
-     process status of the WAL receiver process, displayed using the
-     <command>ps</> command (see <xref linkend="monitoring-ps"> for details).
     </para>
    </sect3>
----------------
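
To make the comparison concrete, here is a minimal sketch of turning the two
locations into a byte lag. It assumes the locations are printed as two
hexadecimal numbers separated by a slash (log id / offset) and treats them
as the high and low 32 bits of one position; whether every log id really
spans a full 4 GB depends on the server version, so take the result as an
estimate. The function names and the sample values are made up for
illustration.

```python
def parse_xlog_location(location: str) -> int:
    """Convert 'LOGID/OFFSET' (both hexadecimal) into one integer position.

    Assumption: the log id is the high 32 bits and the offset the low
    32 bits of a single 64-bit WAL position.
    """
    log_id, offset = location.strip().split("/")
    return (int(log_id, 16) << 32) + int(offset, 16)


def replay_lag_bytes(primary_location: str, standby_replay_location: str) -> int:
    """Bytes of WAL written on the primary but not yet replayed on the standby."""
    return (parse_xlog_location(primary_location)
            - parse_xlog_location(standby_replay_location))


if __name__ == "__main__":
    # Hypothetical values, as they might be returned by
    # pg_current_xlog_location() on the primary and
    # pg_last_xlog_replay_location() on the standby.
    print(replay_lag_bytes("0/3000000", "0/2000000"))
```

The two values have to be fetched separately, one query against each server,
so the result is only as fresh as the older of the two samples.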

>> Why is standby_keep_segments used even if max_wal_senders is zero?
>> In that case, ISTM we don't need to keep any WAL files in pg_xlog
>> for the standby.
>
> True. I don't think we should second guess the admin on that, though.
> Perhaps he only set max_wal_senders=0 temporarily, and will be
> disappointed if the logs are no longer there when he sets it back to
> non-zero and restarts the server.

OK. Since the behavior is not intuitive to me, I'd like to add a note
at the end of the description of "standby_keep_segments". How about?:

----------------
This setting has an effect even if max_wal_senders is zero.
----------------
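
For a rough sense of how much space in pg_xlog the setting can hold back,
here is a minimal sketch. It assumes the default WAL segment size of 16 MB
(a build-time option, so it may differ on a given installation); the
function name is made up for illustration.

```python
# Assumed default WAL segment size; a build-time option, so it may differ.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def retained_wal_bytes(standby_keep_segments: int,
                       segment_bytes: int = WAL_SEGMENT_BYTES) -> int:
    """Disk space taken by the WAL files kept back for standby servers."""
    if standby_keep_segments < 0:
        raise ValueError("standby_keep_segments must not be negative")
    return standby_keep_segments * segment_bytes


if __name__ == "__main__":
    # Hypothetical setting of 32 segments.
    print(retained_wal_bytes(32))
```

This counts only the files held because of the setting itself; segments kept
for other reasons, such as checkpoints, come on top of that.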

>> When walreceiver has gotten stuck for some reason, walsender would be
>> unable to pass through the send() system call, and also get stuck.
>> In the patch, such a walsender cannot exit forever because it cannot
>> call XLogRead(). So I think that the bgwriter needs to send the
>> exit-signal to such a too-lagged walsender. Thoughts?
>
> Any backend can get stuck like that.

OK.

> +     },
> +
> +     {
> +         {"standby_keep_segments", PGC_SIGHUP, WAL_CHECKPOINTS,
> +             gettext_noop("Sets the number of WAL files held for standby servers"),
> +             NULL
> +         },
> +         &StandbySegments,
> +         0, 0, INT_MAX, NULL, NULL

We should s/WAL_CHECKPOINTS/WAL_REPLICATION ?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
