Re: Proposal: Log inability to lock pages during vacuum - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Proposal: Log inability to lock pages during vacuum
Date
Msg-id CAM-w4HNpoj_qfPY+7juVrcFhR=Gbk3tpFcPc_5q8R-tdmbsinQ@mail.gmail.com
In response to Proposal: Log inability to lock pages during vacuum  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Responses Re: Proposal: Log inability to lock pages during vacuum
List pgsql-hackers
On Mon, Oct 20, 2014 at 2:57 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> Currently, a non-freeze vacuum will punt on any page it can't get a cleanup
> lock on, with no retry. Presumably this should be a rare occurrence, but I
> think it's bad that we just assume that and won't warn the user if something
> bad is going on.
>
> My thought is that if we skip any pages elog(LOG) how many we skipped. If we
> skip more than 1% of the pages we visited (not relpages) then elog(WARNING)
> instead.

Is there some specific failure you've run into where a page was stuck
in a pinned state and never got vacuumed?

I would like to see a more systematic way of going about this. What
LSN or timestamp is associated with the oldest unvacuumed page? How
many times have we tried to visit it? What do those numbers look like
overall -- i.e. what's the median number of attempts it takes to vacuum a
page, and what does the distribution of unvacuumed page ages look like?

With that data it should be possible to determine if the behaviour is
actually working well and where to draw the line to determine outliers
that might represent bugs.
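
To make that concrete, here is a rough sketch of the kind of summary I have
in mind. It assumes we somehow had a per-page count of vacuum attempts, which
vacuum does not record today. The function name, the sample data and the
1.5 * IQR cutoff are all illustrative assumptions, not anything that exists
in PostgreSQL; the cutoff is just one common rule of thumb for flagging
outliers.

```python
from statistics import median, quantiles


def summarize_attempts(attempts, k=1.5):
    """Summarize a hypothetical list of per-page vacuum attempt counts.

    attempts: number of visits it took before each page was vacuumed
              (assumed data; vacuum does not track this today).
    k:        multiplier for the interquartile-range outlier rule
              (an assumption, chosen only for illustration).
    """
    # Quartiles of the observed attempt counts.
    q1, _, q3 = quantiles(attempts, n=4, method="inclusive")
    # Anything above q3 + k * IQR is treated as a possible outlier.
    cutoff = q3 + k * (q3 - q1)
    outliers = [a for a in attempts if a > cutoff]
    return {"median": median(attempts), "cutoff": cutoff, "outliers": outliers}


# Made-up sample: most pages are vacuumed on the first visit, one is stuck.
sample = [1, 1, 1, 1, 2, 1, 1, 3, 1, 40]
summary = summarize_attempts(sample)
print(summary)
```

The same summary could be run over page ages (LSN or timestamp deltas)
instead of attempt counts. The point is only that the threshold comes out of
the observed distribution rather than being picked up front.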

-- 
greg


