Non-reserved replication slots and slot advancing - Mailing list pgsql-hackers

From Michael Paquier
Subject Non-reserved replication slots and slot advancing
Date
Msg-id 20180626071305.GH31353@paquier.xyz
Whole thread Raw
Responses Re: Non-reserved replication slots and slot advancing
List pgsql-hackers
Hi all,

I have been chewing for the last couple of days on this email from
Horiguchi-san:
https://www.postgresql.org/message-id/20180622.163312.254556300.horiguchi.kyotaro@lab.ntt.co.jp

As summarized, it is actually strange to be able to advance a slot which
has a non-reserved restart_lsn.  For example, take that which can happen
on HEAD:
=# select pg_create_physical_replication_slot('toto');
 pg_create_physical_replication_slot
-------------------------------------
 (toto,)
(1 row)
=# select pg_replication_slot_advance('toto', '0/1');
 pg_replication_slot_advance
-----------------------------
 (toto,0/1)
(1 row)
=# select slot_name, restart_lsn from pg_replication_slots ;
 slot_name | restart_lsn
-----------+-------------
 toto      | 0/1
(1 row)

KeepLogSeg() is careful enough to not go past LSN positions which do not
exist anymore in the backend, and a client trying to consume a LSN to a
position which does not exist would get back an error to the backend
mentioning that the wanted segment does not exist anymore.  Hence it
seems to me that being to advance a slot for an unreserved slot should
result in an error.  At least, that's easier to restrict things this
way as it could still be possible to advance a slot up to a position
where a WAL segment is still available on the backend.  But I think that
allowing such a thing would be full of race conditions with WAL
recycling and I am not sure about the use case.

I think that documenting the fast that restart_lsn is NULL for a slot
which has not been reserved yet would be nice.  Attached is an idea of
patch, ideas are welcome.  An open item has been added as well.

Thanks,
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Kyotaro HORIGUCHI
Date:
Subject: Let's remove DSM_IMPL_NONE.
Next
From: Thomas Munro
Date:
Subject: Re: Excessive CPU usage in StandbyReleaseLocks()