Re: BUG #17212: pg_amcheck fails on checking temporary relations - Mailing list pgsql-hackers

From Robert Haas
Subject Re: BUG #17212: pg_amcheck fails on checking temporary relations
Date
Msg-id CA+Tgmoat1M4MAqUQM1H+bkWR=wXmi33bB0YRR=xaFk9ADs5Jeg@mail.gmail.com
Whole thread Raw
In response to Re: BUG #17212: pg_amcheck fails on checking temporary relations  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: BUG #17212: pg_amcheck fails on checking temporary relations
List pgsql-hackers
On Wed, Oct 6, 2021 at 3:56 PM Peter Geoghegan <pg@bowt.ie> wrote:
> I agree, with the stipulation that the caller (in this case
> pg_amcheck) is required to know certain basic things about the
> relation in order to get useful behavior. For example, if you use
> bt_index_check() with a GIN index, you're going to get an error. That
> much we can all agree on, I'm sure.

Yes.

> Where I might go further than you or Mark (not sure) is on this: I
> also think that it's the caller's job to not call the functions with
> temp relations, or (in the case of the index verification stuff) with
> !indisready or !indisvalid relations. I believe that these ought to
> also be treated as basic questions about the relation, just like in my
> GIN example. But that's as far as I go here.

I am on board with this, with slight trepidation.

> > > --parent-check does present us with the question of what to do in Hot
> > > Standby mode, where it will surely fail (because it requires a
> > > relation level ShareLock, etc). But I actually don't think it's
> > > complicated: we must throw an error, because it's fundamentally not
> > > something that will ever work (with any index). Whether the error
> > > comes from pg_amcheck or amcheck proper doesn't seem important to me.
> >
> > That detail, to me, is actually very important.
>
> I believe that you actually reached the same conclusion, though: we
> should let it just fail. That makes this question easy.

Great.

> > > I think it's pretty clear that verify_heapam.c (from amcheck proper)
> > > should just follow verify_nbtree.c when directly invoked against an
> > > unlogged index in Hot Standby. That is, it should assume that the
> > > relation has no storage, but still "verify" it conceptually. Just show
> > > a NOTICE about it. Assume no storage to verify.
> >
> > I haven't checked the code, but that sounds right. I interpret this to
> > mean that the different sub-parts of amcheck don't handle this case in
> > ways that are consistent with each other, and that seems wrong. We
> > should make them consistent.
>
> I agree that nbtree and heapam verification ought to agree here. But
> my point was just that this behavior just makes sense: what we have is
> something just like an empty relation.

I am not confident that this behavior is optimal. It's pretty
arbitrary. It's like saying "well, you asked me to check that everyone
in the car was wearing seatbelts, and the car has no seatbelts, so
we're good!"

To which I respond: maybe. Were we trying to verify that people are
complying with safety regulations as well as may be possible under the
circumstances, or that people are actually safe?

The analogy here is: are we trying to verify that the relations are
valid? Or are we just trying to verify that they are as valid as we
can expect them to be?

For me, the deciding point is that verify_nbtree.c was here first, and
it set a precedent. Unless there is a compelling reason to do
otherwise, we should make later things conform to that precedent.
Whether that's actually best, I'm not certain. It might be, but I'm
not sure that it is.

> > > Finally, there is the question of what happens inside pg_amcheck (not
> > > amcheck proper) deals with unlogged relations in Hot Standby mode.
> > > There are two reasonable options: it can either "verify" the indexes
> > > (actually just show those NOTICE messages), or skip them entirely. I
> > > lean towards the former option, on the grounds that I don't think it
> > > should be special-cased. But I don't feel very strongly about it.
> >
> > I like having it do this:
> >
> >         ereport(NOTICE,
> >                         (errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
> >                          errmsg("cannot verify unlogged index \"%s\"
> > during recovery, skipping",
> >                                         RelationGetRelationName(rel))));
> >
> > I think the fewer decisions the command-line tool makes, the better.
> > We should put the policy decisions in amcheck itself.
>
> Wait, so you're arguing that we should change amcheck (both nbtree and
> heapam verification) to simply reject unlogged indexes during
> recovery?

No, that's existing code from btree_index_mainfork_expected. I thought
you were saying that verify_heapam.c should adopt the same approach,
and I was agreeing, not because I think it's necessarily the perfect
approach, but for consistency.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: BUG #17212: pg_amcheck fails on checking temporary relations
Next
From: Robert Haas
Date:
Subject: Re: Role Self-Administration