Thread: error_severity of brin work item

error_severity of brin work item

From
Justin Pryzby
Date:
I saw a bunch of these in my logs:

log_time | 2020-10-25 22:59:45.619-07
database | 
left     | could not open relation with OID 292103095
left     | processing work entry for relation "ts.child.alarms_202010_alarm_clear_time_idx"

Those happen following a REINDEX job on that index.

I think that should be more like an INFO message, since that's what vacuum does
(vacuum_open_relation), and a queued work item is even more likely to hit a
dropped relation.  It's easy to hit by setting autovacuum_naptime=1 and looping
around REINDEX CONCURRENTLY while updating a table.

Autovacuum is queueing work items for later:

src/backend/postmaster/autovacuum.c-            switch (workitem->avw_type)
src/backend/postmaster/autovacuum.c-            {
src/backend/postmaster/autovacuum.c-                    case AVW_BRINSummarizeRange:
src/backend/postmaster/autovacuum.c:                            DirectFunctionCall2(brin_summarize_range,
src/backend/postmaster/autovacuum.c-
ObjectIdGetDatum(workitem->avw_relation),
src/backend/postmaster/autovacuum.c-
Int64GetDatum((int64)workitem->avw_blockNumber));
 
src/backend/postmaster/autovacuum.c-                            break;

And if the index is missing:

brin_summarize_range(PG_FUNCTION_ARGS)
|        /*
|         * We must lock table before index to avoid deadlocks.  However, if the
|         * passed indexoid isn't an index then IndexGetRelation() will fail.
|         * Rather than emitting a not-very-helpful error message, postpone
|         * complaining, expecting that the is-it-an-index test below will fail.
|         */
|        heapoid = IndexGetRelation(indexoid, true);
|        if (OidIsValid(heapoid))
|                heapRel = table_open(heapoid, ShareUpdateExclusiveLock);
|        else
|                heapRel = NULL;
|
|        indexRel = index_open(indexoid, ShareUpdateExclusiveLock);

table_open is succcessful and then index_open fails.  (I thought locks would
have prevented that ?)

Justin



Re: error_severity of brin work item

From
Alvaro Herrera
Date:
On 2020-Nov-13, Justin Pryzby wrote:

> I saw a bunch of these in my logs:
> 
> log_time | 2020-10-25 22:59:45.619-07
> database | 
> left     | could not open relation with OID 292103095
> left     | processing work entry for relation "ts.child.alarms_202010_alarm_clear_time_idx"
> 
> Those happen following a REINDEX job on that index.
> 
> I think that should be more like an INFO message, since that's what vacuum does
> (vacuum_open_relation), and a queued work item is even more likely to hit a
> dropped relation.

Ah, interesting.  Yeah, I agree this is a bug.  I think it can be fixed
by using try_relation_open() on the index; if that returns NULL, discard
the work item.

Does this patch solve the problem?


Attachment

Re: error_severity of brin work item

From
Justin Pryzby
Date:
On Fri, Nov 13, 2020 at 01:39:31PM -0300, Alvaro Herrera wrote:
> On 2020-Nov-13, Justin Pryzby wrote:
> 
> > I saw a bunch of these in my logs:
> > 
> > log_time | 2020-10-25 22:59:45.619-07
> > database | 
> > left     | could not open relation with OID 292103095
> > left     | processing work entry for relation "ts.child.alarms_202010_alarm_clear_time_idx"
> > 
> > Those happen following a REINDEX job on that index.
> > 
> > I think that should be more like an INFO message, since that's what vacuum does
> > (vacuum_open_relation), and a queued work item is even more likely to hit a
> > dropped relation.
> 
> Ah, interesting.  Yeah, I agree this is a bug.  I think it can be fixed
> by using try_relation_open() on the index; if that returns NULL, discard
> the work item.
> 
> Does this patch solve the problem?

Your patch didn't actually say "try_relation_open", so didn't work.
But it does works if I do that, and close the table.

I tested like:

pryzbyj=# ALTER SYSTEM SET backtrace_functions='try_relation_open,relation_open';
pryzbyj=# ALTER SYSTEM SET autovacuum_naptime=3; SELECT pg_reload_conf();
pryzbyj=# CREATE TABLE tt AS SELECT generate_series(1,9999)i;
pryzbyj=# CREATE INDEX ON tt USING brin(i) WITH(autosummarize,pages_per_range=1);
pryzbyj=# \! while :; do psql -h /tmp -qc 'SET client_min_messages=info' -c 'REINDEX INDEX CONCURRENTLY tt_i_idx';
done&

-- run this 5-10 times and hit the "...was not recorded" message, which for
-- whatever reason causes the race condition involving work queue
pryzbyj=# UPDATE tt SET i=1+i;

2020-11-13 11:50:46.093 CST [30687] ERROR:  could not open relation with OID 1110882
2020-11-13 11:50:46.093 CST [30687] CONTEXT:  processing work entry for relation "pryzbyj.public.tt_i_idx"
2020-11-13 11:50:46.093 CST [30687] BACKTRACE:
        postgres: autovacuum worker pryzbyj(+0xb9ce8) [0x55acf2af0ce8]
        postgres: autovacuum worker pryzbyj(index_open+0xb) [0x55acf2bab59b]
        postgres: autovacuum worker pryzbyj(brin_summarize_range+0x8f) [0x55acf2b5b5bf]
        postgres: autovacuum worker pryzbyj(DirectFunctionCall2Coll+0x62) [0x55acf2f40372]
        ...

-- 
Justin

Attachment

Re: error_severity of brin work item

From
Justin Pryzby
Date:
On Fri, Nov 13, 2020 at 12:11:21PM -0600, Justin Pryzby wrote:
> On Fri, Nov 13, 2020 at 01:39:31PM -0300, Alvaro Herrera wrote:
> > On 2020-Nov-13, Justin Pryzby wrote:
> > 
> > > I saw a bunch of these in my logs:
> > > 
> > > log_time | 2020-10-25 22:59:45.619-07
> > > database | 
> > > left     | could not open relation with OID 292103095
> > > left     | processing work entry for relation "ts.child.alarms_202010_alarm_clear_time_idx"
> > > 
> > > Those happen following a REINDEX job on that index.
> > > 
> > > I think that should be more like an INFO message, since that's what vacuum does
> > > (vacuum_open_relation), and a queued work item is even more likely to hit a
> > > dropped relation.
> > 
> > Ah, interesting.  Yeah, I agree this is a bug.  I think it can be fixed
> > by using try_relation_open() on the index; if that returns NULL, discard
> > the work item.
> > 
> > Does this patch solve the problem?
> 
> Your patch didn't actually say "try_relation_open", so didn't work.
> But it does works if I do that, and close the table.

That patch broke the case that a non-index is passed, which I addressed here.

I wondered if the function should return NULL in those cases, but it seems to
be "impossible".

Attachment

Re: error_severity of brin work item

From
Alvaro Herrera
Date:
On 2020-Nov-19, Justin Pryzby wrote:

> On Fri, Nov 13, 2020 at 12:11:21PM -0600, Justin Pryzby wrote:

> > Your patch didn't actually say "try_relation_open", so didn't work.
> > But it does works if I do that, and close the table.

Thanks for fixing and testing.

> That patch broke the case that a non-index is passed, which I addressed here.

Hmm, I think the reaction to that should be the same as before, so
rather than return 0, the patch should raise the same error that
index_open() would.




Re: error_severity of brin work item

From
Justin Pryzby
Date:
On Thu, Nov 19, 2020 at 03:15:21PM -0300, Alvaro Herrera wrote:
> On 2020-Nov-19, Justin Pryzby wrote:
> 
> > On Fri, Nov 13, 2020 at 12:11:21PM -0600, Justin Pryzby wrote:
> 
> > > Your patch didn't actually say "try_relation_open", so didn't work.
> > > But it does works if I do that, and close the table.
> 
> Thanks for fixing and testing.
> 
> > That patch broke the case that a non-index is passed, which I addressed here.
> 
> Hmm, I think the reaction to that should be the same as before, so
> rather than return 0, the patch should raise the same error that
> index_open() would.

The resulting logic is not very clear and requires a lot of commentary..

BTW I saw that in tablecmds.c, RangeVarCallbackForAttachIndex() does this:

        if (classform->relkind != RELKIND_PARTITIONED_INDEX &&
                classform->relkind != RELKIND_INDEX)
                ereport(ERROR,
                                (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
                                 errmsg("\"%s\" is not an index", rv->relname)));

Is it wrong to use ERRCODE_INVALID_OBJECT_DEFINITION ?
Most other places say ERRCODE_WRONG_OBJECT_TYPE

Likewise, transformPartitionCmd() in parse_utilcmd.c:
        (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
         errmsg("\"%s\" is not a partitioned table",
                RelationGetRelationName(parentRel))));

-- 
Justin

Attachment

Re: error_severity of brin work item

From
Alvaro Herrera
Date:
I think this formulation (attached v3) has fewer moving parts.

However, now that I did that, I wonder if this is really the best
approach to solve this problem.  Maybe instead of doing this at the BRIN
level, it should be handled at the autovac level, by having the worker
copy the work-item to local memory and remove it from the shared list as
soon as it is in progress.  That way, if *any* error occurs while trying
to execute it, it will go away instead of being retried for all
eternity.

Preliminary patch for that attached as autovacuum-workitem.patch.

I would propose to clean that up to apply instead of your proposed fix.

Attachment

Re: error_severity of brin work item

From
Justin Pryzby
Date:
On Mon, Nov 23, 2020 at 04:39:57PM -0300, Alvaro Herrera wrote:
> I think this formulation (attached v3) has fewer moving parts.
> 
> However, now that I did that, I wonder if this is really the best
> approach to solve this problem.  Maybe instead of doing this at the BRIN
> level, it should be handled at the autovac level, by having the worker
> copy the work-item to local memory and remove it from the shared list as
> soon as it is in progress.  That way, if *any* error occurs while trying
> to execute it, it will go away instead of being retried for all
> eternity.
> 
> Preliminary patch for that attached as autovacuum-workitem.patch.
> 
> I would propose to clean that up to apply instead of your proposed fix.

I don't know why you said it would be retried for eternity ?
I think the perform_work_item() TRY/CATCH avoids that.

Also .. I think your idea doesn't solve my issue, that REINDEX CONCURRENTLY is
causing vacuum to leave errors in my logs.

I check that the first patch avoids the issue and the 2nd one does not.

-- 
Justin



Re: error_severity of brin work item

From
Alvaro Herrera
Date:
The more I look at this, the less I like it.  This would set a precedent
that any action that can be initiated from an autovac work-item has a
requirement of silently being discarded when it referenced a
non-existant relation.

I'd rather have the code that drops the index go through the list of
work-items and delete those that reference that relation.

I'm not sure if this is something that ought to be done in index_drop();
One objection might be that if the drop is rolled back, the work-items
are lost.  It's the easiest, though; and work-items are supposed to be
lossy anyway, and vacuum would fix the lack of summarization eventually.
So, not pretty, but not all that bad.  (Hopefully rolled-back drops are
not all that common.)



Re: error_severity of brin work item

From
Justin Pryzby
Date:
On Mon, Nov 30, 2020 at 08:47:32PM -0300, Alvaro Herrera wrote:
> The more I look at this, the less I like it.  This would set a precedent
> that any action that can be initiated from an autovac work-item has a
> requirement of silently being discarded when it referenced a
> non-existant relation.

My original request was to change to INFO, which is what vacuum proper does at
vacuum_open_relation().  I realize that still means that new work item handlers
would have to do the LockOid, try_open dance - maybe it could be factored out.

I noticed this on multiple servers immediate after changing a nagios script to
look for all ERRORs (rather than a specific list of log_messages) and for a
longer time period.

> I'd rather have the code that drops the index go through the list of
> work-items and delete those that reference that relation.
> 
> I'm not sure if this is something that ought to be done in index_drop();
> One objection might be that if the drop is rolled back, the work-items
> are lost.

Should it be done in an AtCommit hook or something like that ?

-- 
Justin



Re: error_severity of brin work item

From
Alvaro Herrera
Date:
On 2020-Nov-30, Justin Pryzby wrote:

> On Mon, Nov 30, 2020 at 08:47:32PM -0300, Alvaro Herrera wrote:
> > The more I look at this, the less I like it.  This would set a precedent
> > that any action that can be initiated from an autovac work-item has a
> > requirement of silently being discarded when it referenced a
> > non-existant relation.
> 
> My original request was to change to INFO, which is what vacuum proper does at
> vacuum_open_relation().  I realize that still means that new work item handlers
> would have to do the LockOid, try_open dance - maybe it could be factored out.

As I understand, INFO is not well suited to messages that are not going
to the client.  Anyway, my point is about the contortions that are
needed to support the case, rather than what exactly do we do when it
happens.

Retrospectively, it's strange that this problem (what happens when
indexes with pending work-items are dropped) hadn't manifested.  It
seems a pretty obvious one.

> > I'd rather have the code that drops the index go through the list of
> > work-items and delete those that reference that relation.
> > 
> > I'm not sure if this is something that ought to be done in index_drop();
> > One objection might be that if the drop is rolled back, the work-items
> > are lost.
> 
> Should it be done in an AtCommit hook or something like that ?

I didn't like this idea much on first read, on extensibility grounds,
but perhaps it's not so bad because we can generalize it whenever
there's pressure to add a second type of work-item (*if* that ever
happens).

I suppose the idea is that index_drop saves the index OID when a BRIN
index with autosummarization=on is dropped, and then the
AtCommit_WorkItems() call scans the items list and drops those that
match any OIDs in the list.  (It does require to be mindful of subxact
aborts, of course.)



Re: error_severity of brin work item

From
Justin Pryzby
Date:
On Tue, Dec 01, 2020 at 11:07:30AM -0300, Alvaro Herrera wrote:
> > Should it be done in an AtCommit hook or something like that ?
> 
> I didn't like this idea much on first read, on extensibility grounds,
> but perhaps it's not so bad because we can generalize it whenever
> there's pressure to add a second type of work-item (*if* that ever
> happens).
> 
> I suppose the idea is that index_drop saves the index OID when a BRIN
> index with autosummarization=on is dropped, and then the
> AtCommit_WorkItems() call scans the items list and drops those that
> match any OIDs in the list.  (It does require to be mindful of subxact
> aborts, of course.)

This was an idea I made up - I don't know any of the details of this, but if
you give a hint I could look at it more.  There'd (still) be a race window, but
I think that's ok.

Another idea is if perform_work_item() were responsible for discarding
relations which disappear.  Currently it does this, which is racy since it
holds no lock.

        cur_relname = get_rel_name(workitem->avw_relation);
        cur_nspname = get_namespace_name(get_rel_namespace(workitem->avw_relation));
        cur_datname = get_database_name(MyDatabaseId);
        if (!cur_relname || !cur_nspname || !cur_datname)
                goto deleted2;

-- 
Justin



Re: error_severity of brin work item

From
Alvaro Herrera
Date:
On 2020-Dec-01, Justin Pryzby wrote:

> This was an idea I made up - I don't know any of the details of this, but if
> you give a hint I could look at it more.  There'd (still) be a race window, but
> I think that's ok.

See CommitTransaction() and friends, where AtEOXact_on_commit_actions()
and others are called.  You'd have to create a new routine (say
AtEOXact_Autovacuum or more specific AtEOXact_AutovacuumWorkItems), to
be called at the right places in xact.c.  Keep a global variable, say a
list of OIDs.  On subxact commit, the list is reassigned to its parent
transaction; on subxact abort, the list is discarded.  On top xact
commit, the list of OIDs is passed to some new routine in autovacuum.c
that scans the workitem array and deletes items as appropriate.

Not sure what's a good place for OIDs to be added to the list.  We don't
have AM-specific entry points for relation drop.  I think this is the
weakest point of this.


> Another idea is if perform_work_item() were responsible for discarding
> relations which disappear.  Currently it does this, which is racy since it
> holds no lock.

That has the property that it remains contained in autovacuum.c, but no
other advantages I think.



Re: error_severity of brin work item

From
Justin Pryzby
Date:
On Tue, Dec 01, 2020 at 03:57:24PM -0300, Alvaro Herrera wrote:
> On 2020-Dec-01, Justin Pryzby wrote:
> 
> > This was an idea I made up - I don't know any of the details of this, but if
> > you give a hint I could look at it more.  There'd (still) be a race window, but
> > I think that's ok.
> 
> See CommitTransaction() and friends, where AtEOXact_on_commit_actions()

..thanks.  I was going to persue the other idea first since this is new to me.

> Not sure what's a good place for OIDs to be added to the list.  We don't
> have AM-specific entry points for relation drop.  I think this is the
> weakest point of this.

I assume it would just add OIDs of *all* dropped rels, and the autovacuum half
would silently ignore any OID for which there's no work item.  (As it would do
in any case).

> > Another idea is if perform_work_item() were responsible for discarding
> > relations which disappear.  Currently it does this, which is racy since it
> > holds no lock.
> 
> That has the property that it remains contained in autovacuum.c, but no
> other advantages I think.

It has the advantage that it moves all the try_open stuff out of brin.

I started implementing this, and then realized that the try_open stuff *has* to
be in the brin_summarize function, to handle the case that someone passes a
non-index, since it's SQL exposed.
So maybe we should use your LockOid patch now, and refactor in the future if we
add additional work-item types.

-- 
Justin



Re: error_severity of brin work item

From
Alvaro Herrera
Date:
On 2021-Sep-24, Jaime Casanova wrote:

> Do you plan to work on this for this CF?

I think the most reasonable action is to push the patch in
https://postgr.es/m/20201123193957.GA21810@alvherre.pgsql to all
branches, closing the immediate hole, and we can see about the xact-hook
stuff (when we have it) to master only, which I view as a better
mechanism to protect work-items going forward.  That's what I understand
Justin was suggesting in his last reply.

-- 
Álvaro Herrera           39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"La fuerza no está en los medios físicos
sino que reside en una voluntad indomable" (Gandhi)



Re: error_severity of brin work item

From
Justin Pryzby
Date:
On Fri, Sep 24, 2021 at 10:42:36AM -0300, Alvaro Herrera wrote:
> I think the most reasonable action is to push the patch in
> https://postgr.es/m/20201123193957.GA21810@alvherre.pgsql to all
> branches, closing the immediate hole, and we can see about the xact-hook
> stuff (when we have it) to master only, which I view as a better
> mechanism to protect work-items going forward.  That's what I understand
> Justin was suggesting in his last reply.

I suggested to use the 20 line solution rather than invent new infrastructure
just to avoid corner case errors.

But I also don't understand why changing to use an INFO message doesn't work.
vacuum_open_relation() outputs a WARNING for manual vacuum and an INFO for
autovacuum when log_min_duration >= 0.  Why couldn't this do the same thing?

I found a few instances of this in my logs, indicating that my understanding of
the logic is not wrong.

log_time               | 2021-01-13 15:23:40.687-05
session_line           | 2
session_start_time     | 2021-01-13 15:23:19-05
error_severity         | LOG
sql_state_code         | 42P01
message                | skipping vacuum of "pg_toast_984781820" --- relation no longer exists
backend_type           | autovacuum worker

-- 
Justin



Re: error_severity of brin work item

From
Daniel Gustafsson
Date:
This thread has stalled, and from the discussions there isn't consencus on how
to proceed (if at all?), so I'm marking this Returned with Feedback.  If the
discussion and work picks up again then it can be resubmitted.

--
Daniel Gustafsson        https://vmware.com/