Re: pg_replication_slot_advance to return NULL instead of 0/0 if slot not advanced - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: pg_replication_slot_advance to return NULL instead of 0/0 if slot not advanced
Date
Msg-id CABUevEz2KqxByz=ZPq1RQn7Jv4HpngTJ2yCNCaC92in14dUdQw@mail.gmail.com
In response to pg_replication_slot_advance to return NULL instead of 0/0 if slot not advanced (Michael Paquier <michael@paquier.xyz>)
Responses Re: pg_replication_slot_advance to return NULL instead of 0/0 if slot not advanced
Re: pg_replication_slot_advance to return NULL instead of 0/0 if slot not advanced
List pgsql-hackers


On Fri, May 25, 2018 at 7:28 AM, Michael Paquier <michael@paquier.xyz> wrote:
> Hi all,
>
> When attempting to use multiple times pg_replication_slot_advance on a
> slot, then the caller gets back directly InvalidXLogRecPtr as result,
> for example:
> =# select * from pg_replication_slot_advance('popo', 'FF/0');
>  slot_name |  end_lsn
> -----------+-----------
>  popo      | 0/60021E0
> (1 row)
> =# select * from pg_replication_slot_advance('popo', 'FF/0');
>  slot_name | end_lsn
> -----------+---------
>  popo      | 0/0
> (1 row)
>
> Wouldn't it be more simple to return NULL to mean that the slot could
> not be moved forward?  That would be easier to parse for clients.
> Please see the attached.

I agree that returning 0/0 on this is wrong.

However, can this actually occur in any case other than exactly "moving the position to where the position already is"? Looking at the physical slot path, at least, that seems to be the only case, and there I think the correct thing to return would be the new position, not NULL. If we actually *fail* to move the position, we give an error.

Actually, isn't there also a race there? When we try to move the slot, we check that we're not trying to move it backwards and throw an error if we are, but that check is done outside the lock. Later, when we actually move it, we check *again* whether we're trying to move it backwards, but if that second check fails we return InvalidXLogRecPtr (which can happen in the case of concurrent activity on the slot, I think). In that case, maybe we should just re-check and raise an error appropriately?
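
To make the suggestion concrete, here is a minimal standalone sketch of the behaviour I have in mind. It is not the PostgreSQL code: DemoSlot, AdvanceStatus and demo_slot_advance are hypothetical names, and a pthread mutex stands in for whatever lock protects the slot. The point is only that the backwards check is done under the lock, that a failed check is reported as an error status rather than as 0/0, and that asking for the position the slot is already at returns that position.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's LSN type; hypothetical for this sketch. */
typedef uint64_t XLogRecPtr;

/* Hypothetical, simplified slot holding only what the sketch needs. */
typedef struct DemoSlot
{
    pthread_mutex_t mutex;       /* protects restart_lsn */
    XLogRecPtr      restart_lsn; /* current position of the slot */
} DemoSlot;

typedef enum AdvanceStatus
{
    ADVANCE_OK,        /* slot is now at (or was already at) moveto */
    ADVANCE_BACKWARDS  /* moveto is behind the slot; caller should error */
} AdvanceStatus;

/*
 * Try to move the slot forward to moveto.
 *
 * The backwards check happens while the lock is held, so a concurrent
 * advance cannot slip in between the check and the update. On return,
 * *end_lsn always holds the slot's actual position, never 0/0.
 */
AdvanceStatus
demo_slot_advance(DemoSlot *slot, XLogRecPtr moveto, XLogRecPtr *end_lsn)
{
    AdvanceStatus status = ADVANCE_OK;

    assert(slot != NULL && end_lsn != NULL);

    pthread_mutex_lock(&slot->mutex);
    if (moveto < slot->restart_lsn)
        status = ADVANCE_BACKWARDS;  /* leave the slot untouched */
    else
        slot->restart_lsn = moveto;  /* no-op when already at moveto */
    *end_lsn = slot->restart_lsn;
    pthread_mutex_unlock(&slot->mutex);

    return status;
}
```

With this shape, calling the function twice with the same target returns the same position both times, and only a genuine attempt to go backwards (including one caused by concurrent activity) is surfaced to the caller as an error.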

(I haven't looked at the logical slot path, but I assume it would have something similar in it)
