Re: [HACKERS] Restricting maximum keep segments by repslots - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [HACKERS] Restricting maximum keep segments by repslots
Date
Msg-id CAD21AoBdfoLSgujPZ_TpnH5zdQz0jg-Y8OXtZ=TCO787Sey-=w@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Restricting maximum keep segments by repslots  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: [HACKERS] Restricting maximum keep segments by repslots  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
On Thu, Sep 13, 2018 at 6:30 PM Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>
> Hello.
>
> Thank you for the comments, Sawada-san, Peter.
>
> At Mon, 10 Sep 2018 19:52:24 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote
in<20180910.195224.22629595.horiguchi.kyotaro@lab.ntt.co.jp>
 
> > At Thu, 6 Sep 2018 22:32:21 +0200, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote in
<29bbd79d-696b-509e-578a-0fc38a3b9405@2ndquadrant.com>
> > Thanks for pointing that. That's a major cause of confusion. Does
> > the following make sense?
> >
> > > Specify the maximum size of WAL files that <link
> > > linkend="streaming-replication-slots">replication slots</link>
> > > are allowed to retain in the <filename>pg_wal</filename>
> > > directory at checkpoint time.  If
> > > <varname>max_slot_wal_keep_size</varname> is zero (the
> > > default), replication slots retain unlimited size of WAL files.
> > + If restart_lsn of a replication slot gets behind more than that
> > + bytes from the current LSN, the standby using the slot may not
> > + be able to reconnect due to removal of required WAL records.
> ...
> > > Also, I don't think 0 is a good value for the default behavior.  0 would
> > > mean that a slot is not allowed to retain any more WAL than already
> > > exists anyway.  Maybe we don't want to support that directly, but it's a
> > > valid configuration.  So maybe use -1 for infinity.
> >
> > In realtion to the reply just sent to Sawada-san, remain of a
> > slot can be at most 16MB in the 0 case with the default segment
> > size. So you're right in this sense. Will fix in the coming
> > version. Thanks.
>
> I did the following thinkgs in the new version.
>
> - Changed the disable (or infinite) and default value of
>   max_slot_wal_keep_size to -1 from 0.
>   (patch 1, 2. guc.c, xlog.c: GetOldestKeepSegment())
>
> - Fixed documentation for max_slot_wal_keep_size tomention what
>   happnes when WAL exceeds the size, and additional rewrites.
>   (patch 4, catalogs.sgml, config.sgml)
>
> - Folded parameter list of GetOldestKeepSegment().
>   (patch 1, 2. xlog.c)
>
> - Provided the plural form of errdetail of checkpoint-time
>   warning.  (patch 1, xlog.c: KeepLogSeg())
>
> - Some cosmetic change and small refactor.
>   (patch 1, 2, 3)
>

Sorry for the late response. The patch still can be applied to the
curent HEAD so I reviewed the latest patch.

The value of 'remain' and 'wal_status' might not be correct. Although
'wal_stats' shows 'lost' but we can get changes from the slot. I've
tested it with the following steps.

=# alter system set max_slot_wal_keep_size to '64MB'; -- while
wal_keep_segments is 0
=# select pg_reload_conf();
=# select slot_name, wal_status, remain, pg_size_pretty(remain) as
remain_pretty from pg_replication_slots ;
 slot_name | wal_status |  remain  | remain_pretty
-----------+------------+----------+---------------
 1         | streaming  | 83885648 | 80 MB
(1 row)

** consume 80MB WAL, and do CHECKPOINT **

=# select slot_name, wal_status, remain, pg_size_pretty(remain) as
remain_pretty from pg_replication_slots ;
 slot_name | wal_status | remain | remain_pretty
-----------+------------+--------+---------------
 1         | lost       |      0 | 0 bytes
(1 row)
=# select count(*) from pg_logical_slot_get_changes('1', NULL, NULL);
 count
-------
    15
(1 row)

-----
I got the following result with setting of wal_keep_segments >
max_slot_keep_size. The 'wal_status' shows 'streaming' although the
'remain' is 0.

=# select slot_name, wal_status, remain from pg_replication_slots limit 1;
 slot_name | wal_status | remain
-----------+------------+--------
 1         | streaming  |      0
(1 row)

+               XLByteToSeg(targetLSN, restartSeg, wal_segment_size);
+               if (max_slot_wal_keep_size_mb >= 0 && currSeg <=
restartSeg + limitSegs)
+               {

You use limitSegs here but shouldn't we use keepSeg instead? Actually
I've commented this point for v6 patch before[1], and this had been
fixed in the v7 patch. However you're using limitSegs again from v8
patch again. I might be missing something though.

Changed the status to 'Waiting on Author'.

[1] https://www.postgresql.org/message-id/CAD21AoD0rChq7wQE%3D_o95quopcQGjcVG9omwdH07nT5cm81hzg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/20180904.195250.144186960.horiguchi.kyotaro%40lab.ntt.co.jp

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATIONNTT Open Source Software Center


pgsql-hackers by date:

Previous
From: Laurenz Albe
Date:
Subject: Re: Contribution to postgresql
Next
From: Dilip Kumar
Date:
Subject: Re: Side effect of CVE-2017-7484 fix?