Thread: Catalog_xmin is not advanced when a logical slot is lost
postgres=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary |
active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
-----------+---------------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
s2 | test_decoding | logical | 5 | postgres | f | f
| | | 771 | 0/30466368 | 0/304663A0
| reserved | 28903824 | f
(1 row)
postgres=# create table t2(c int, c1 char(100));
CREATE TABLE
postgres=# drop table t2;
DROP TABLE
postgres=# vacuum pg_class;
VACUUM
postgres=# select n_dead_tup from pg_stat_all_tables where relname =
'pg_class';
n_dead_tup
------------
2
(1 row)
postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr |
client_hostname | client_port | backend_start | backend_xmin | state |
sent_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag |
replay_lag | sync_pri
ority | sync_state | reply_time
-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+--------------+-------+----------+-----------+-----------+------------+-----------+-----------+------------+---------
------+------------+------------
(0 rows)
postgres=# insert into t1 select * from t1;
INSERT 0 2097152
postgres=# checkpoint;
CHECKPOINT
postgres=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary |
active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
-----------+---------------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
s2 | test_decoding | logical | 5 | postgres | f | f
| | | 771 | | 0/304663A0
| lost | | f
(1 row)
postgres=# vacuum pg_class;
VACUUM
postgres=# select n_dead_tup from pg_stat_all_tables where relname =
'pg_class';
n_dead_tup
------------
0
(1 row)
Attachment
On 2022-Nov-20, sirisha chamarthi wrote: > Hi Hackers, > > forking this thread from the discussion [1] as suggested by Amit. > > Catalog_xmin is not advanced when a logical slot is invalidated (lost) > until the invalidated slot is dropped. This patch ignores invalidated slots > while computing the oldest xmin. Attached a small patch to address this and > the output after the patch is as shown below. Oh wow, that's bad :-( I'll get it patched immediately. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "Here's a general engineering tip: if the non-fun part is too complex for you to figure out, that might indicate the fun part is too ambitious." (John Naylor) https://postgr.es/m/CAFBsxsG4OWHBbSDM%3DsSeXrQGOtkPiOEOuME4yD7Ce41NtaAD9g%40mail.gmail.com
Hi Sirisha, Thanks for identifying the bug and the solution. Some review comments inlined. On Mon, Nov 21, 2022 at 2:49 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2022-Nov-20, sirisha chamarthi wrote: > > > Hi Hackers, > > > > forking this thread from the discussion [1] as suggested by Amit. > > > > Catalog_xmin is not advanced when a logical slot is invalidated (lost) > > until the invalidated slot is dropped. This patch ignores invalidated slots > > while computing the oldest xmin. Attached a small patch to address this and > > the output after the patch is as shown below. > > Oh wow, that's bad :-( I'll get it patched immediately. + /* ignore invalid slots while computing the oldest xmin */ + if (TransactionIdIsValid(invalidated_at_lsn)) + continue; I think the condition should be if (!XLogRecPtrIsInvalid(invalidated_at_lsn)) LSN and XID are different data types. and to be inline with pg_get_replication_slots() 361 if (XLogRecPtrIsInvalid(slot_contents.data.restart_lsn) && 362 !XLogRecPtrIsInvalid(slot_contents.data.invalidated_at)) 363 walstate = WALAVAIL_REMOVED; we should also check restart_lsn. I would write this as bool invalidated_slot = false; then under spinlock invalidated_slot = XLogRecPtrIsInvalid(s->data.restart_lsn) && !XLogRecPtrIsInvalid(s->data.invalidated_at); if (invalidated_slot) continue. -- Best Wishes, Ashutosh Bapat
On 2022-Nov-21, Ashutosh Bapat wrote: > I think the condition should be > > if (!XLogRecPtrIsInvalid(invalidated_at_lsn)) LSN and XID are > different data types. Yeah, this bit is wrong. I agree with your suggestion to just keep a boolean flag, as we don't need more than that. > and to be inline with pg_get_replication_slots() > 361 if (XLogRecPtrIsInvalid(slot_contents.data.restart_lsn) && > 362 !XLogRecPtrIsInvalid(slot_contents.data.invalidated_at)) > 363 walstate = WALAVAIL_REMOVED; > > we should also check restart_lsn. Hmm, I'm not sure about this one. I'm not sure why we check both in pg_get_replication_slots. I suppose we didn't want to ignore a slot only if it had a non-zero invalidated_at in case it was accidental (say, we initialize a slot as valid, but forget to zero-out the invalidated_at value); but I think that's pretty much useless. This is only changed with the spinlock held, so it's not like you can see partially-set state. In fact, as I recall we could replace invalidated_at in ReplicationSlotPersistentData with a boolean "invalidated" flag, and leave restart_lsn alone when invalidated. IIRC the only reason we didn't do it that way was that we feared some code might observe some valid value in restart_lsn without noticing that it belonged to an invalidate slot. (Which is exactly what happened now, except with a different field.) -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
On Mon, Nov 21, 2022 at 5:39 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2022-Nov-21, Ashutosh Bapat wrote: > > > I think the condition should be > > > > if (!XLogRecPtrIsInvalid(invalidated_at_lsn)) LSN and XID are > > different data types. > > Yeah, this bit is wrong. I agree with your suggestion to just keep a > boolean flag, as we don't need more than that. > > > and to be inline with pg_get_replication_slots() > > 361 if (XLogRecPtrIsInvalid(slot_contents.data.restart_lsn) && > > 362 !XLogRecPtrIsInvalid(slot_contents.data.invalidated_at)) > > 363 walstate = WALAVAIL_REMOVED; > > > > we should also check restart_lsn. > > Hmm, I'm not sure about this one. I'm not sure why we check both in > pg_get_replication_slots. I suppose we didn't want to ignore a slot > only if it had a non-zero invalidated_at in case it was accidental (say, > we initialize a slot as valid, but forget to zero-out the invalidated_at > value); but I think that's pretty much useless. This is only changed > with the spinlock held, so it's not like you can see partially-set > state. > > In fact, as I recall we could replace invalidated_at in > ReplicationSlotPersistentData with a boolean "invalidated" flag, and > leave restart_lsn alone when invalidated. IIRC the only reason we > didn't do it that way was that we feared some code might observe some > valid value in restart_lsn without noticing that it belonged to an > invalidate slot. (Which is exactly what happened now, except with a > different field.) > Maybe. In that case pg_get_replication_slots() should be changed. We should use the same criteria to decide whether a slot is invalidated or not at all the places. I am a fan of stricter, all-assumption-covering conditions. In case we don't want to check restart_lsn, an Assert might be useful to validate our assumption. -- Best Wishes, Ashutosh Bapat
On 2022-Nov-21, Ashutosh Bapat wrote: > Maybe. In that case pg_get_replication_slots() should be changed. We > should use the same criteria to decide whether a slot is invalidated > or not at all the places. Right. > I am a fan of stricter, all-assumption-covering conditions. In case we > don't want to check restart_lsn, an Assert might be useful to validate > our assumption. Agreed. I'll throw in an assert. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
On 2022-Nov-21, Ashutosh Bapat wrote:
> Maybe. In that case pg_get_replication_slots() should be changed. We
> should use the same criteria to decide whether a slot is invalidated
> or not at all the places.
Right.
> I am a fan of stricter, all-assumption-covering conditions. In case we
> don't want to check restart_lsn, an Assert might be useful to validate
> our assumption.
Agreed. I'll throw in an assert.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Attachment
On 2022-Nov-21, sirisha chamarthi wrote: > > > I am a fan of stricter, all-assumption-covering conditions. In case we > > > don't want to check restart_lsn, an Assert might be useful to validate > > > our assumption. > > > > Agreed. I'll throw in an assert. > > Changed this in the patch to throw an assert. Thank you. I had pushed mine for CirrusCI to test, and it failed the assert I added in slot.c: https://cirrus-ci.com/build/4786354503548928 Not yet sure why, looking into it. You didn't add any asserts to the slot.c code. In slotfuncs.c, I'm not sure I want to assert anything about restart_lsn in any cases other than when invalidated_at is set. In other words, I prefer this coding in pg_get_replication_slots: if (!XLogRecPtrIsInvalid(slot_contents.data.invalidated_at)) { Assert(XLogRecPtrIsInvalid(slot_contents.data.restart_lsn)); walstate = WALAVAIL_REMOVED; } else walstate = GetWALAvailability(slot_contents.data.restart_lsn); Your proposal is doing this: switch (walstate) { [...] case WALAVAIL_REMOVED: if (!XLogRecPtrIsInvalid(slot_contents.data.restart_lsn)) { [...] if (pid != 0) [...] break; } Assert(XLogRecPtrIsInvalid(slot_contents.data.restart_lsn)); which sounds like it could be hit if the replica is connected to the slot. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
On 2022-Nov-21, sirisha chamarthi wrote:
> > > I am a fan of stricter, all-assumption-covering conditions. In case we
> > > don't want to check restart_lsn, an Assert might be useful to validate
> > > our assumption.
> >
> > Agreed. I'll throw in an assert.
>
> Changed this in the patch to throw an assert.
Thank you. I had pushed mine for CirrusCI to test, and it failed the
assert I added in slot.c:
https://cirrus-ci.com/build/4786354503548928
Not yet sure why, looking into it.
pg_receivewal: disconnected; waiting 5 seconds to try again
pg_receivewal: error: unexpected termination of replication stream: ERROR: requested WAL segment 0000000100000000000000EB has already been removed
pg_receivewal: disconnected; waiting 5 seconds to try again
^Cpostgres@pgvm:~$ /usr/local/pgsql/bin/psql
psql (16devel)
Type "help" for help.
postgres=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
-----------+---------------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
s3 | test_decoding | logical | 5 | postgres | f | f | | | 769 | | 0/A992E7D0 | lost | | f
s5 | | physical | | | f | f | | | | 0/EB000000 | | lost | | f
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
On 2022-Nov-21, sirisha chamarthi wrote: > On Mon, Nov 21, 2022 at 8:05 AM Alvaro Herrera <alvherre@alvh.no-ip.org> > wrote: > > Thank you. I had pushed mine for CirrusCI to test, and it failed the > > assert I added in slot.c: > > https://cirrus-ci.com/build/4786354503548928 > > Not yet sure why, looking into it. > > Can this be because restart_lsn is not set to InvalidXLogRecPtr for the > physical slots? Hmm, that makes no sense. Is that yet another bug? Looking. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "No es bueno caminar con un hombre muerto"
On 2022-Nov-21, sirisha chamarthi wrote:
> On Mon, Nov 21, 2022 at 8:05 AM Alvaro Herrera <alvherre@alvh.no-ip.org>
> wrote:
> > Thank you. I had pushed mine for CirrusCI to test, and it failed the
> > assert I added in slot.c:
> > https://cirrus-ci.com/build/4786354503548928
> > Not yet sure why, looking into it.
>
> Can this be because restart_lsn is not set to InvalidXLogRecPtr for the
> physical slots?
Hmm, that makes no sense. Is that yet another bug? Looking.
pg_create_physical_replication_slot
-------------------------------------
(s1,)
(1 row)
postgres=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
s1 | | physical | | | f | f | | | | | | | -8254390272 | f
(1 row)
postgres=# checkpoint;
CHECKPOINT
postgres=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
s1 | | physical | | | f | f | | | | | | | -8374095064 | f
(1 row)
postgres=# \q
postgres@pgvm:~$ /usr/local/pgsql/bin/pg_receivewal -S s1 -D .
pg_receivewal: error: unexpected termination of replication stream: ERROR: requested WAL segment 0000000100000000000000EB has already been removed
pg_receivewal: disconnected; waiting 5 seconds to try again
^Cpostgres@pgvm:~$ /usr/local/pgsql/bin/psql
psql (16devel)
Type "help" for help.
postgres=# select * from pg_replication_slots;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
!?> ^C
!?>
TRAP: failed Assert("XLogRecPtrIsInvalid(slot_contents.data.restart_lsn)"), File: "slotfuncs.c", Line: 371, PID: 3953707
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"No es bueno caminar con un hombre muerto"
On 2022-Nov-21, sirisha chamarthi wrote: > It appears to be. wal_sender is setting restart_lsn to a valid LSN even > when the slot is invalidated. > postgres@pgvm:~$ /usr/local/pgsql/bin/pg_receivewal -S s1 -D . > pg_receivewal: error: unexpected termination of replication stream: ERROR: > requested WAL segment 0000000100000000000000EB has already been removed > pg_receivewal: disconnected; waiting 5 seconds to try again > ^Cpostgres@pgvm:~$ /usr/local/pgsql/bin/psql > psql (16devel) > Type "help" for help. > > postgres=# select * from pg_replication_slots; > server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. Whoa, I cannot reproduce this :-( -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "Java is clearly an example of money oriented programming" (A. Stepanov)
On 2022-Nov-21, sirisha chamarthi wrote:
> It appears to be. wal_sender is setting restart_lsn to a valid LSN even
> when the slot is invalidated.
> postgres@pgvm:~$ /usr/local/pgsql/bin/pg_receivewal -S s1 -D .
> pg_receivewal: error: unexpected termination of replication stream: ERROR:
> requested WAL segment 0000000100000000000000EB has already been removed
> pg_receivewal: disconnected; waiting 5 seconds to try again
> ^Cpostgres@pgvm:~$ /usr/local/pgsql/bin/psql
> psql (16devel)
> Type "help" for help.
>
> postgres=# select * from pg_replication_slots;
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
Whoa, I cannot reproduce this :-(
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
s2 | | physical | | | f | f | | | | 2/DC000000 | | lost | | f
(1 row)
postgres=# \q
postgres@pgvm:~$ ls
0000000100000002000000D8 0000000100000002000000D9 0000000100000002000000DA 0000000100000002000000DB 0000000100000002000000DC.partial
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Java is clearly an example of money oriented programming" (A. Stepanov)
On Mon, Nov 21, 2022 at 10:11 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:On 2022-Nov-21, sirisha chamarthi wrote:
> It appears to be. wal_sender is setting restart_lsn to a valid LSN even
> when the slot is invalidated.
> postgres@pgvm:~$ /usr/local/pgsql/bin/pg_receivewal -S s1 -D .
> pg_receivewal: error: unexpected termination of replication stream: ERROR:
> requested WAL segment 0000000100000000000000EB has already been removed
> pg_receivewal: disconnected; waiting 5 seconds to try again
> ^Cpostgres@pgvm:~$ /usr/local/pgsql/bin/psql
> psql (16devel)
> Type "help" for help.
>
> postgres=# select * from pg_replication_slots;
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
Whoa, I cannot reproduce this :-(I have a old .partial file in the data directory to reproduce this.postgres=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
s2 | | physical | | | f | f | | | | 2/DC000000 | | lost | | f
(1 row)
postgres=# \q
postgres@pgvm:~$ ls
0000000100000002000000D8 0000000100000002000000D9 0000000100000002000000DA 0000000100000002000000DB 0000000100000002000000DC.partial
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Java is clearly an example of money oriented programming" (A. Stepanov)
On 2022-Nov-21, sirisha chamarthi wrote: > I have a old .partial file in the data directory to reproduce this. I don't think the .partial file is in itself important. But I think this whole thing is a distraction. I managed to reproduce it eventually, by messing with the slot and WAL at random, and my conclusion is that we shouldn't mess with this at all for this bugfix. Instead I'm going to do what Ashutosh mentioned at the start, which is to verify both the restart_lsn and the invalidated_at, when deciding whether to ignore the slot. It seems to me that there is a bigger mess here, considering that we use the effective_xmin in some places and the other xmin (the one that's saved to disk) in others. I have no patience for trying to disentangle that at this point, though. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Having your biases confirmed independently is how scientific progress is made, and hence made our great society what it is today" (Mary Gardiner)
On 2022-Nov-21, sirisha chamarthi wrote:
> I have a old .partial file in the data directory to reproduce this.
I don't think the .partial file is in itself important. But I think
this whole thing is a distraction.
I managed to reproduce it
eventually, by messing with the slot and WAL at random, and my
conclusion is that we shouldn't mess with this at all for this bugfix.
Instead I'm going to do what Ashutosh mentioned at the start, which is
to verify both the restart_lsn and the invalidated_at, when deciding
whether to ignore the slot.
It seems to me that there is a bigger mess here, considering that we use
the effective_xmin in some places and the other xmin (the one that's
saved to disk) in others. I have no patience for trying to disentangle
that at this point, though.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Having your biases confirmed independently is how scientific progress is
made, and hence made our great society what it is today" (Mary Gardiner)
On 2022-Nov-21, sirisha chamarthi wrote: > On Mon, Nov 21, 2022 at 10:56 AM Alvaro Herrera <alvherre@alvh.no-ip.org> > wrote: > > Instead I'm going to do what Ashutosh mentioned at the start, which is > > to verify both the restart_lsn and the invalidated_at, when deciding > > whether to ignore the slot. > > Sounds good to me. Thanks! Done now. I also a new elog(DEBUG1), which I think makes the issue a bit easier notice. I think it would be even better if we reset the underlying data from effective_catalog_xmin ... even with this patch, we show a non-zero value for a slot in status "lost" (and we ignore it when computing the overall xmin), which I think is quite confusing. But we can do that in master only. Thanks for reporting this issue. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/