Re: recovering from "found xmin ... from before relfrozenxid ..." - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: recovering from "found xmin ... from before relfrozenxid ..." |
Date | |
Msg-id | CA+fd4k6yQbjF=wxLShpBNF7pstsOdqCUXhSNv-v698LLYaYrEA@mail.gmail.com Whole thread Raw |
In response to | Re: recovering from "found xmin ... from before relfrozenxid ..." (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>) |
Responses |
Re: recovering from "found xmin ... from before relfrozenxid ..."
Re: recovering from "found xmin ... from before relfrozenxid ..." |
List | pgsql-hackers |
On Tue, 25 Aug 2020 at 17:08, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Fri, 21 Aug 2020 at 22:25, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > Hi Masahiko-san, > > > > Please find the updated patch with the following new changes: > > > > Thank you for updating the patch! > > > 1) It adds the code changes in heap_force_kill function to clear an > > all-visible bit on the visibility map corresponding to the page that > > is marked all-visible. Along the way it also clears PD_ALL_VISIBLE > > flag on the page header. > > I think we need to clear all visibility map bits by using > VISIBILITYMAP_VALID_BITS. Otherwise, the page has all-frozen bit but > not all-visible bit, which is not a valid state. > > > > > 2) It adds the code changes in heap_force_freeze function to reset the > > ctid value in a tuple header if it is corrupted. > > > > 3) It adds several notes and examples in the documentation stating > > when and how we need to use the functions provided by this module. > > > > Please have a look and let me know for any other concern. > > > > And here are small comments on the heap_surgery.c: > > + /* > + * Get the offset numbers from the tids belonging to one particular page > + * and process them one by one. > + */ > + blkno = tids_same_page_fetch_offnums(tids, ntids, &next_start_ptr, > + offnos); > + > + /* Calculate the number of offsets stored in offnos array. */ > + noffs = next_start_ptr - curr_start_ptr; > + > + /* > + * Update the current start pointer so that next time when > + * tids_same_page_fetch_offnums() is called, we can calculate the number > + * of offsets present in the offnos array. > + */ > + curr_start_ptr = next_start_ptr; > + > + /* Check whether the block number is valid. */ > + if (blkno >= nblocks) > + { > + ereport(NOTICE, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("skipping block %u for relation \"%s\" > because the block number is out of range", > + blkno, RelationGetRelationName(rel)))); > + continue; > + } > + > + CHECK_FOR_INTERRUPTS(); > > I guess it would be better to call CHECK_FOR_INTERRUPTS() at the top > of the do loop for safety. I think it's unlikely to happen but the > user might mistakenly specify a lot of wrong block numbers. > > ---- > + offnos = (OffsetNumber *) palloc(ntids * sizeof(OffsetNumber)); > + noffs = curr_start_ptr = next_start_ptr = 0; > + nblocks = RelationGetNumberOfBlocks(rel); > + > + do > + { > > (snip) > > + > + /* > + * Get the offset numbers from the tids belonging to one particular page > + * and process them one by one. > + */ > + blkno = tids_same_page_fetch_offnums(tids, ntids, &next_start_ptr, > + offnos); > + > + /* Calculate the number of offsets stored in offnos array. */ > + noffs = next_start_ptr - curr_start_ptr; > + > > (snip) > > + /* No ereport(ERROR) from here until all the changes are logged. */ > + START_CRIT_SECTION(); > + > + for (i = 0; i < noffs; i++) > > You copy all offset numbers belonging to the same page to palloc'd > array, offnos, and iterate it while processing the tuples. I might be > missing something but I think we can do that without allocating the > space for offset numbers. Is there any reason for this? I guess we can > do that by just iterating the sorted tids array. > Let me share other comments on the latest version patch: Some words need to be tagged. For instance, I found the following words: VACUUM DISABLE_PAGE_SKIPPING HEAP_XMIN_FROZEN HEAP_XMAX_INVALID --- +test=# select ctid from t1 where xmin = 507; + ctid +------- + (0,3) +(1 row) + +test=# select heap_force_freeze('t1'::regclass, ARRAY['(0, 3)']::tid[]); +-[ RECORD 1 ]-----+- +heap_force_freeze | I think it's better to use a consistent output format. The former uses the normal format whereas the latter uses the expanded format. --- + <note> + <para> + While performing surgery on a damaged relation, we must not be doing anything + else on that relation in parallel. This is to ensure that when we are + operating on a damaged tuple there is no other transaction trying to modify + that tuple. + </para> + </note> If we prefer to avoid concurrent operations on the target relation why don't we use AccessExclusiveLock? --- +CREATE FUNCTION heap_force_kill(reloid regclass, tids tid[]) +RETURNS VOID +AS 'MODULE_PATHNAME', 'heap_force_kill' +LANGUAGE C STRICT; +CREATE FUNCTION heap_force_freeze(reloid regclass, tids tid[]) +RETURNS VOID +AS 'MODULE_PATHNAME', 'heap_force_freeze' +LANGUAGE C STRICT; I think these functions should be PARALLEL UNSAFE. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: