On Fri, May 25, 2018 at 7:28 AM, Michael Paquier <michael@paquier.xyz> wrote:
Hi all,
When calling pg_replication_slot_advance multiple times on a slot, the caller directly gets back InvalidXLogRecPtr as the result, for example:

=# select * from pg_replication_slot_advance('popo', 'FF/0');
 slot_name |  end_lsn
-----------+-----------
 popo      | 0/60021E0
(1 row)

=# select * from pg_replication_slot_advance('popo', 'FF/0');
 slot_name | end_lsn
-----------+---------
 popo      | 0/0
(1 row)
Wouldn't it be more simple to return NULL to mean that the slot could not be moved forward? That would be easier to parse for clients. Please see the attached.
I agree that returning 0/0 on this is wrong.
However, can this actually occur in any case other than exactly "moving the position to where the position already is"? Looking at the physical slot path at least, that seems to be the only case, and in that case I think the correct thing to return would be the new position, not NULL. If we actually *fail* to move the position, we raise an error.
Actually, isn't there also a race there? When we try to move the slot, we first check that we're not trying to move it backwards and throw an error if we are, but that check is done outside the lock. Later, when we actually move it, we check *again* whether we're trying to move it backwards, but if that second check fails we return InvalidXLogRecPtr (which can happen in the case of concurrent activity on the slot, I think). In this case, maybe we should just re-check that and raise an error appropriately?
(I haven't looked at the logical slot path, but I assume it would have something similar in it)
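
To make the suggestion concrete, here is a rough standalone sketch of what I mean. It is not the actual slot code: Lsn, Slot, AdvanceStatus and slot_advance() are all made-up names, and a pthread mutex stands in for the slot's lock. The only point is that the backwards check is done while the lock is held, that a backwards move is reported as an error rather than as 0/0, and that a no-op move still returns the current position.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for XLogRecPtr. */
typedef uint64_t Lsn;

/* Hypothetical result codes; the real code would use ereport(ERROR, ...). */
typedef enum
{
    ADVANCE_OK,             /* position moved forward */
    ADVANCE_NOOP,           /* already at the requested position */
    ADVANCE_ERR_BACKWARDS   /* request is behind the current position */
} AdvanceStatus;

/* Hypothetical slot: one position protected by one lock. */
typedef struct
{
    pthread_mutex_t mutex;
    Lsn             restart_lsn;
} Slot;

/*
 * Advance the slot to "target". The comparison happens under the lock, so a
 * concurrent advance cannot slip in between the check and the update.
 * *end_lsn always receives the slot's current position, never an invalid one.
 */
static AdvanceStatus
slot_advance(Slot *slot, Lsn target, Lsn *end_lsn)
{
    AdvanceStatus status;

    assert(slot != NULL && end_lsn != NULL);

    pthread_mutex_lock(&slot->mutex);
    if (target < slot->restart_lsn)
        status = ADVANCE_ERR_BACKWARDS;     /* caller should raise an error */
    else if (target == slot->restart_lsn)
        status = ADVANCE_NOOP;              /* nothing to do, not a failure */
    else
    {
        slot->restart_lsn = target;
        status = ADVANCE_OK;
    }
    *end_lsn = slot->restart_lsn;
    pthread_mutex_unlock(&slot->mutex);

    return status;
}
```

With something shaped like this, calling the function twice with the same target gives the same end position both times, and only a genuinely backwards request is treated as an error.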