Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple - Mailing list pgsql-committers

From Andres Freund
Subject Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple
Date
Msg-id 20171207202243.pmxwjm5wvz5lp6j6@alap3.anarazel.de
Whole thread Raw
In response to Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-committers
Hi,

On 2017-12-06 17:23:55 -0300, Alvaro Herrera wrote:
> > I've played around quite some with the attached patch. So far, after
> > applying the second patch, neither VACUUM nor VACUUM FULL / CLUSTER make
> > the situation worse for already existing corruption. HOT pruning can
> > change the exact appearance of existing corruption a bit, but I don't
> > think it can make the corruption meaningfully worse.  It's a bit
> > annoying and scary to add so many checks to backbranches but it kinda
> > seems required.  The error message texts aren't perfect, but these are
> > "should never be hit" type elog()s so I'm not too worried about that.
> 
> Looking at 0002: I agree with the stuff being done here.  I think a
> couple of these checks could be moved one block outerwards in term of
> scope; I don't see any reason why the check should not apply in that
> case.  I didn't catch any place missing additional checks.

I think I largely put them into the inner blocks because they were
guaranteed to be reached in those case (the horizon has to be before the
cutoff etc), and that way additional branches are avoided.


> Despite these being "shouldn't happen" conditions, I think we should
> turn these up all the way to ereports with an errcode and all, and also
> report the XIDs being complained about.  No translation required,
> though.  Other than those changes and minor copy editing a commit
> (attached), 0002 looks good to me.

Hm, I don't really care one way or another. I do see that you used
errmsg() in some places, errmsg_internal() in others. Was that
intentional?

> I started thinking it'd be good to report block number whenever anything
> happened while scanning the relation.  The best way to go about this
> seems to be to add an errcontext callback to lazy_scan_heap, so I attach
> a WIP untested patch to add that.  (I'm not proposing this for
> back-patch for now, mostly because I don't have the time/energy to push
> for it right now.)

That seems like a good idea. There's some cases where that could
increase log spam noticeably (unitialized blocks), but that seems
acceptable.


> +static void
> +lazy_scan_heap_cb(void *arg)
> +{
> +    LazyScanHeapInfo *info = (LazyScanHeapInfo *) arg;
> +
> +    if (info->blkno != InvalidBlockNumber)
> +        errcontext("while scanning page %u of relation %s",
> +                   info->blkno, RelationGetRelationName(info->relation));
> +    else
> +        errcontext("while vacuuming relation %s",
> +                   RelationGetRelationName(info->relation));
> +}

Hm, perhaps rephrase so both messages refer to vacuuming? E.g. just by
replacing scanning with vacuuming?

Greetings,

Andres Freund


pgsql-committers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple
Next
From: Andres Freund
Date:
Subject: Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updatedtuple