Re: pgsql: Fix "base" snapshot handling in logical decoding - Mailing list pgsql-hackers

From Arseny Sher
Subject Re: pgsql: Fix "base" snapshot handling in logical decoding
Date
Msg-id 87r2kliqja.fsf@ars-thinkpad
Whole thread Raw
In response to Re: pgsql: Fix "base" snapshot handling in logical decoding  (Arseny Sher <a.sher@postgrespro.ru>)
Responses Re: pgsql: Fix "base" snapshot handling in logical decoding
Re: pgsql: Fix "base" snapshot handling in logical decoding
List pgsql-hackers
Arseny Sher <a.sher@postgrespro.ru> writes:

> There is also one thing that puzzles me as I don't know much about
> vacuum internals. If I do plain VACUUM of pg_attribute in the test, it
> shouts "catalog is missing 1 attribute(s) for relid" error (which is
> quite expected), while with 'VACUUM FULL pg_attribute' the tuple is
> silently (and wrongly, with dropped column missing) decoded. Moreover,
> if I perform the test manually, and do 'VACUUM FULL;', sometimes test
> becomes useless -- that is, tuple is successfully decoded with all three
> columns, as though VACUUM was not actually executed. All this is without
> the main patch, of course. I think I will look into this soon.

So I have been jumping around this and learned a few curious things.

1) Test in its current shape sometimes doesn't fulfill its aim indeed --
  that is, despite issued VACUUM the tuple is still decoded with all
  three fields. This happens because during attempt to advance xmin
  there is a good possiblity to encounter xl_running_xacts record logged
  before our CHECKPOINT (they are logged each 15 seconds). We do not
  try to advance xmin twice without client acknowledgment, so in this
  case xmin will not be advanced far enough to allow vacuum prune entry
  from pg_attribute.

2) This is not easy to notice because often (but not always) explicit
   VACUUM is not needed at all: tuple is often pruned by microvacuum
   (heap_page_prune_opts) right in the final decoding session. If we
   hadn't bumped xmin far enough during previous get_changes, we do that
   now, so microvacuum actually purges the entry. But if we were so
   unfortunate that 1) extra xl_running_xacts was logged and 2)
   microvacuum was in bad mood and didn't come, pg_attribute is not
   vacuumed and test becomes useless. To make this bulletproof, in the
   attached patch I doubled first get_changes: now there are two client
   acks, so our VACUUM always does the job.

3) As a side note, answer to my question 'why do we get different errors
   with VACUUM and VACUUM FULL' is the following. With VACUUM FULL, not
   only old pg_attribute entry is pruned, but also xmin of new entry
   with attisdropped=true is reset to frozen xid. This means that
   decoding session (RelationBuildTupleDesc) actually sees 3 attributes,
   and the fact that one of them is dropped doesn't embarrass this
   function (apparently relnatts in pg_class is never decremented) --
   we just go ahead and decode only live attributes.

--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/test_decoding/expected/oldest_xmin.out b/contrib/test_decoding/expected/oldest_xmin.out
index d09342c4be..d1b4f17e3a 100644
--- a/contrib/test_decoding/expected/oldest_xmin.out
+++ b/contrib/test_decoding/expected/oldest_xmin.out
@@ -1,6 +1,6 @@
 Parsed test spec with 2 sessions
 
-starting permutation: s0_begin s0_getxid s1_begin s1_insert s0_alter s0_commit s0_checkpoint s0_get_changes s1_commit
s0_vacuums0_get_changes
 
+starting permutation: s0_begin s0_getxid s1_begin s1_insert s0_alter s0_commit s0_checkpoint s0_get_changes
s0_get_changess1_commit s0_vacuum s0_get_changes
 
 step s0_begin: BEGIN;
 step s0_getxid: SELECT txid_current() IS NULL;
 ?column?       
@@ -14,8 +14,11 @@ step s0_checkpoint: CHECKPOINT;
 step s0_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0',
'skip-empty-xacts','1');
 
 data           
 
+step s0_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0',
'skip-empty-xacts','1');
 
+data           
+
 step s1_commit: COMMIT;
-step s0_vacuum: VACUUM FULL;
+step s0_vacuum: VACUUM pg_attribute;
 step s0_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0',
'skip-empty-xacts','1');
 
 data           
 
diff --git a/contrib/test_decoding/specs/oldest_xmin.spec b/contrib/test_decoding/specs/oldest_xmin.spec
index 4f8af70aa2..141fe2b145 100644
--- a/contrib/test_decoding/specs/oldest_xmin.spec
+++ b/contrib/test_decoding/specs/oldest_xmin.spec
@@ -22,7 +22,7 @@ step "s0_getxid" { SELECT txid_current() IS NULL; }
 step "s0_alter" { ALTER TYPE basket DROP ATTRIBUTE mangos; }
 step "s0_commit" { COMMIT; }
 step "s0_checkpoint" { CHECKPOINT; }
-step "s0_vacuum" { VACUUM FULL; }
+step "s0_vacuum" { VACUUM pg_attribute; }
 step "s0_get_changes" { SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids',
'0','skip-empty-xacts', '1'); }
 
 
 session "s1"
@@ -30,8 +30,11 @@ step "s1_begin" { BEGIN; }
 step "s1_insert" { INSERT INTO harvest VALUES ((1, 2, 3)); }
 step "s1_commit" { COMMIT; }
 
-# Checkpoint with following get_changes forces to advance xmin. ALTER of a
+# Checkpoint with following get_changes forces xmin advancement. We do
+# get_changes twice because if one more xl_running_xacts record had slipped
+# before our CHECKPOINT, xmin will be advanced only on this record, thus not
+# reaching value needed for vacuuming corresponding pg_attribute entry. ALTER of
 # composite type is a rare form of DDL which allows T1 to see the tuple which
 # will be removed (xmax set) before T1 commits. That is, interlocking doesn't
 # forbid modifying catalog after someone read it (and didn't commit yet).
-permutation "s0_begin" "s0_getxid" "s1_begin" "s1_insert" "s0_alter" "s0_commit" "s0_checkpoint" "s0_get_changes"
"s1_commit""s0_vacuum" "s0_get_changes"
 
+permutation "s0_begin" "s0_getxid" "s1_begin" "s1_insert" "s0_alter" "s0_commit" "s0_checkpoint" "s0_get_changes"
"s0_get_changes""s1_commit""s0_vacuum" "s0_get_changes" 

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: branches_of_interest.txt
Next
From: Alvaro Hernandez
Date:
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)