Re: recovering from "found xmin ... from before relfrozenxid ..." - Mailing list pgsql-hackers
From | Ashutosh Sharma |
---|---|
Subject | Re: recovering from "found xmin ... from before relfrozenxid ..." |
Date | |
Msg-id | CAE9k0Pk2TCTjGqr6yrX1tQNG66zJfdnSSveTqM2cW1R-Jcn30g@mail.gmail.com Whole thread Raw |
In response to | Re: recovering from "found xmin ... from before relfrozenxid ..." (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>) |
Responses |
Re: recovering from "found xmin ... from before relfrozenxid ..."
Re: recovering from "found xmin ... from before relfrozenxid ..." Re: recovering from "found xmin ... from before relfrozenxid ..." |
List | pgsql-hackers |
Hello Masahiko-san, Thanks for the review. Please check the comments inline below: On Fri, Aug 14, 2020 at 10:07 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > Thank you for updating the patch! Here are my comments on v5 patch: > > --- a/contrib/Makefile > +++ b/contrib/Makefile > @@ -35,6 +35,7 @@ SUBDIRS = \ > pg_standby \ > pg_stat_statements \ > pg_trgm \ > + pg_surgery \ > pgcrypto \ > > I guess we use alphabetical order here. So pg_surgery should be placed > before pg_trgm. > Okay, will take care of this in the next version of patch. > --- > + if (heap_force_opt == HEAP_FORCE_KILL) > + ItemIdSetDead(itemid); > > I think that if the page is an all-visible page, we should clear an > all-visible bit on the visibility map corresponding to the page and > PD_ALL_VISIBLE on the page header. Otherwise, index only scan would > return the wrong results. > I think we should let VACUUM do that. Please note that this module is intended to be used only on a damaged relation and should only be operated on damaged tuples of such relations. And the execution of any of the functions provided by this module on a damaged relation must be followed by VACUUM with DISABLE_PAGE_SKIPPING option on that relation. This is necessary to bring back a damaged relation to the sane state once a surgery is performed on it. I will try to add this note in the documentation for this module. > --- > + /* > + * We do not mark the buffer dirty or do WAL logging for unmodifed > + * pages. > + */ > + if (!did_modify_page) > + goto skip_wal; > + > + /* Mark buffer dirty before we write WAL. */ > + MarkBufferDirty(buf); > + > + /* XLOG stuff */ > + if (RelationNeedsWAL(rel)) > + log_newpage_buffer(buf, true); > + > +skip_wal: > + END_CRIT_SECTION(); > + > > s/unmodifed/unmodified/ > okay, will fix this typo. > Do we really need to use goto? I think we can modify it like follows: > > if (did_modity_page) > { > /* Mark buffer dirty before we write WAL. */ > MarkBufferDirty(buf); > > /* XLOG stuff */ > if (RelationNeedsWAL(rel)) > log_newpage_buffer(buf, true); > } > > END_CRIT_SECTION(); > No, we don't need it. We can achieve the same by checking the status of did_modify_page flag as you suggested. I will do this change in the next version. > --- > pg_force_freeze() can revival a tuple that is already deleted but not > vacuumed yet. Therefore, the user might need to reindex indexes after > using that function. For instance, with the following script, the last > two queries: index scan and seq scan, will return different results. > > set enable_seqscan to off; > set enable_bitmapscan to off; > set enable_indexonlyscan to off; > create table tbl (a int primary key); > insert into tbl values (1); > > update tbl set a = a + 100 where a = 1; > > explain analyze select * from tbl where a < 200; > > -- revive deleted tuple on heap > select heap_force_freeze('tbl', array['(0,1)'::tid]); > > -- index scan returns 2 tuples > explain analyze select * from tbl where a < 200; > > -- seq scan returns 1 tuple > set enable_seqscan to on; > explain analyze select * from tbl; > I am not sure if this is the right use-case of pg_force_freeze function. I think we should only be running pg_force_freeze function on a tuple for which VACUUM reports "found xmin ABC from before relfrozenxid PQR" sort of error otherwise it might worsen the things instead of making it better. Now, the question is - can VACUUM report this type of error for a deleted tuple or it would only report it for a live tuple? AFAIU this won't be reported for the deleted tuples because VACUUM wouldn't consider freezing a tuple that has been deleted. > Also, if a tuple updated and moved to another partition is revived by > heap_force_freeze(), its ctid still has special values: > MovedPartitionsOffsetNumber and MovedPartitionsBlockNumber. I don't > see a problem yet caused by a visible tuple having the special ctid > value, but it might be worth considering either to reset ctid value as > well or to not freezing already-deleted tuple. > For this as well, the answer remains the same as above. -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com
pgsql-hackers by date: