Thread: Ideas for a relcache test mode about missing invalidations
Hi, The issue at [1] is caused by missing invalidations, and [2] seems like a likely candidate too. I wonder if it'd be good to have a relcache test mode akin to CLOBBER_CACHE_ALWAYS and RELCACHE_FORCE_RELEASE, that tries to ensure that we've done sufficiently to ensure the right invalidations are sent. I think what we'd kind of want is to ensure that relcache entries are rebuilt at the earliest possible time, but *not* later. That'd mean they're out of date if there's missing invalidations. Unfortunately I'm not clear on how that'd be achievable? Ideas? The best I can come up with is to code some additional dependencies into CacheInvalidateHeapTuple(), and add tracking ensuring we've sent the right messages. But that seems somewhat painful and filled with holes. [1] http://archives.postgresql.org/message-id/CAKoxK%2B5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf%3DHEQ%40mail.gmail.com [2] https://www.postgresql.org/message-id/12259.1532117714@sss.pgh.pa.us Greetings, Andres Freund
Hello. At Wed, 1 Aug 2018 09:25:18 -0700, Andres Freund <andres@anarazel.de> wrote in <20180801162518.jnb2ql5dfmgwp4qo@alap3.anarazel.de> > Hi, > > The issue at [1] is caused by missing invalidations, and [2] seems like > a likely candidate too. I wonder if it'd be good to have a relcache test > mode akin to CLOBBER_CACHE_ALWAYS and RELCACHE_FORCE_RELEASE, that tries > to ensure that we've done sufficiently to ensure the right invalidations > are sent. > > I think what we'd kind of want is to ensure that relcache entries are > rebuilt at the earliest possible time, but *not* later. That'd mean > they're out of date if there's missing invalidations. Unfortunately I'm > not clear on how that'd be achievable? Ideas? > > The best I can come up with is to code some additional dependencies into > CacheInvalidateHeapTuple(), and add tracking ensuring we've sent the > right messages. But that seems somewhat painful and filled with holes. > > [1] http://archives.postgresql.org/message-id/CAKoxK%2B5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf%3DHEQ%40mail.gmail.com > [2] https://www.postgresql.org/message-id/12259.1532117714@sss.pgh.pa.us As for [1], it is not a issue on invalidation. It happens also if the relation has any index and even drop is not needed. The following steps are sufficient. create table t( pk serial, t text ); insert into t( t ) values( 'hello' ), ('world'); create index idx_fake0 on t (pk); create index idx_fake on t ( f_fake( pk ) ); -- ERROR index_create() creates a new pg_index entry with indislive = true before building it. So the planner (RelationGetIndexList) sees the not-yet-populated index while planning the query in f_fake(). The attached patch fixes the issue, but I'm not sure this story is applicable to [2]. regards. -- Kyotaro Horiguchi NTT Open Source Software Center From eaa5de68ff27bd43a643089f8c5963e5cc3d20cc Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp> Date: Thu, 2 Aug 2018 18:52:24 +0900 Subject: [PATCH] Don't use currently-building index to populate it index_create() creates a pg_index entry with indislive = true before populating it. If it is a function index where the function runs a query on the parent relation, planner sees the just created entry and tries to access the heap page that is not created yet. This patch let index_create() not to set indislive = true after population. --- src/backend/catalog/index.c | 55 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 54 insertions(+), 1 deletion(-) diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 8b276bc430..b561c8696d 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -124,6 +124,7 @@ static void UpdateIndexRelation(Oid indexoid, Oid heapoid, bool immediate, bool isvalid, bool isready); +static void ActivateIndexRelation(Oid indexoid); static void index_update_stats(Relation rel, bool hasindex, double reltuples); @@ -666,7 +667,7 @@ UpdateIndexRelation(Oid indexoid, values[Anum_pg_index_indisvalid - 1] = BoolGetDatum(isvalid); values[Anum_pg_index_indcheckxmin - 1] = BoolGetDatum(false); values[Anum_pg_index_indisready - 1] = BoolGetDatum(isready); - values[Anum_pg_index_indislive - 1] = BoolGetDatum(true); + values[Anum_pg_index_indislive - 1] = BoolGetDatum(false); values[Anum_pg_index_indisreplident - 1] = BoolGetDatum(false); values[Anum_pg_index_indkey - 1] = PointerGetDatum(indkey); values[Anum_pg_index_indcollation - 1] = PointerGetDatum(indcollation); @@ -693,6 +694,55 @@ UpdateIndexRelation(Oid indexoid, heap_freetuple(tuple); } +/* ---------------------------------------------------------------- + * ActivateIndexRelation + * + * Publish index by marking it "relislive" + + * UpdateIndexRelation builds an index relation with relislive = false so as + * not to be used by the quieries used to build the index. This function + * marks the index as "live" so that it can be used hereafter. + * ---------------------------------------------------------------- + */ +static void +ActivateIndexRelation(Oid indexoid) +{ + Relation indrel; + SysScanDesc sdesc; + ScanKeyData skey; + HeapTuple tup; + + ScanKeyInit(&skey, + Anum_pg_index_indexrelid, + BTEqualStrategyNumber, F_OIDEQ, + ObjectIdGetDatum(indexoid)); + indrel = heap_open(IndexRelationId, RowExclusiveLock); + sdesc = systable_beginscan(indrel, InvalidOid, true, NULL, 1, &skey); + + /* + * We must see one and only one entry for the key. But don't bother + * checking that. + */ + while (HeapTupleIsValid(tup = systable_getnext(sdesc))) + { + Datum values[Natts_pg_index]; + bool nulls[Natts_pg_index]; + bool replaces[Natts_pg_index]; + + MemSet(values, 0, sizeof(values)); + MemSet(nulls, 0, sizeof(nulls)); + MemSet(replaces, 0, sizeof(replaces)); + values[Anum_pg_index_indislive] = BoolGetDatum(true); + replaces[Anum_pg_index_indislive] = true; + tup = heap_modify_tuple(tup, RelationGetDescr(indrel), + values, nulls, replaces); + CatalogTupleUpdate(indrel, &tup->t_self, tup); + heap_freetuple(tup); + } + systable_endscan(sdesc); + heap_close(indrel, RowExclusiveLock); +} + /* * index_create @@ -1188,6 +1238,9 @@ index_create(Relation heapRelation, true); } + /* let queries use this index */ + ActivateIndexRelation(indexRelationId); + /* * Close the index; but we keep the lock that we acquired above until end * of transaction. Closing the heap is caller's responsibility. -- 2.16.3
Hi, On 2018-08-02 19:18:11 +0900, Kyotaro HORIGUCHI wrote: > At Wed, 1 Aug 2018 09:25:18 -0700, Andres Freund <andres@anarazel.de> wrote in <20180801162518.jnb2ql5dfmgwp4qo@alap3.anarazel.de> > > Hi, > > > > The issue at [1] is caused by missing invalidations, and [2] seems like > > a likely candidate too. I wonder if it'd be good to have a relcache test > > mode akin to CLOBBER_CACHE_ALWAYS and RELCACHE_FORCE_RELEASE, that tries > > to ensure that we've done sufficiently to ensure the right invalidations > > are sent. > > > > I think what we'd kind of want is to ensure that relcache entries are > > rebuilt at the earliest possible time, but *not* later. That'd mean > > they're out of date if there's missing invalidations. Unfortunately I'm > > not clear on how that'd be achievable? Ideas? > > > > The best I can come up with is to code some additional dependencies into > > CacheInvalidateHeapTuple(), and add tracking ensuring we've sent the > > right messages. But that seems somewhat painful and filled with holes. > > > > [1] http://archives.postgresql.org/message-id/CAKoxK%2B5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf%3DHEQ%40mail.gmail.com > > [2] https://www.postgresql.org/message-id/12259.1532117714@sss.pgh.pa.us > > As for [1], it is not a issue on invalidation. It happens also if > the relation has any index and even drop is not needed. The > following steps are sufficient. Huh? I don't think this is a proper fix. But please let's argue over in the other that in the other thread. Greetings, Andres Freund
On Thu, Aug 2, 2018 at 3:18 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: >> [1] http://archives.postgresql.org/message-id/CAKoxK%2B5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf%3DHEQ%40mail.gmail.com >> [2] https://www.postgresql.org/message-id/12259.1532117714@sss.pgh.pa.us > > As for [1], it is not a issue on invalidation. It happens also if > the relation has any index and even drop is not needed. The > following steps are sufficient. > > create table t( pk serial, t text ); > insert into t( t ) values( 'hello' ), ('world'); > create index idx_fake0 on t (pk); > create index idx_fake on t ( f_fake( pk ) ); -- ERROR The fact that there was a weird error wasn't really what we cared about there. If the user is doing something that's clearly unreasonable or nonsensical, then they cannot expect much from the error message (maybe we could do better, but it's not a priority). What we really cared about was the fact that it was possible to make a backend's relcache irrecoverably corrupt. That should never be allowed to happen, even when the user is determined to do something unreasonable. -- Peter Geoghegan
At Thu, 2 Aug 2018 14:40:50 -0700, Peter Geoghegan <pg@bowt.ie> wrote in <CAH2-WznJ8K1HXLTKOY53QBw2dtY0z53xKB-0RpqsJQBS0TkrdQ@mail.gmail.com> pg> On Thu, Aug 2, 2018 at 3:18 AM, Kyotaro HORIGUCHI pg> <horiguchi.kyotaro@lab.ntt.co.jp> wrote: pg> >> [1] http://archives.postgresql.org/message-id/CAKoxK%2B5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf%3DHEQ%40mail.gmail.com pg> >> [2] https://www.postgresql.org/message-id/12259.1532117714@sss.pgh.pa.us pg> > pg> > As for [1], it is not a issue on invalidation. It happens also if pg> > the relation has any index and even drop is not needed. The pg> > following steps are sufficient. pg> > pg> > create table t( pk serial, t text ); pg> > insert into t( t ) values( 'hello' ), ('world'); pg> > create index idx_fake0 on t (pk); pg> > create index idx_fake on t ( f_fake( pk ) ); -- ERROR pg> pg> The fact that there was a weird error wasn't really what we cared pg> about there. If the user is doing something that's clearly pg> unreasonable or nonsensical, then they cannot expect much from the pg> error message (maybe we could do better, but it's not a priority). Hmm. I don't think it's unreasonable or nonsensical, but I don't argue it here. pg> What we really cared about was the fact that it was possible to make a pg> backend's relcache irrecoverably corrupt. That should never be allowed pg> to happen, even when the user is determined to do something pg> unreasonable. I reread through the thread and IUCC, drop_index() is sending inval on the owing relation and invalidation happens (that is, RelationCacheInvalidateEntry is called for the owner relation.). Relcache for the owner is actually being rebuit during the following create index. At least the problem mentioned in [1] is happening using a fresh relcache created after invoking the index creation. The cause is RelationGetIndexList collects all "indislive" indexes, including the idx_fake before population, which can be fixed by the attached patch.. I'm I still misunderstanding something? # Anyway, I'll make another thread for the another issue. [1]: https://www.postgresql.org/message-id/20180628150209.n2qch5jtn3vt2xaa%40alap3.anarazel.de regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Hello. At Fri, 03 Aug 2018 10:54:46 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20180803.105446.155966888.horiguchi.kyotaro@lab.ntt.co.jp> > I reread through the thread and IUCC, drop_index() is sending > inval on the owing relation and invalidation happens (that is, I finally understand that I was totally inept. This came from *the result* of the original -bug thread. Sorry for the noise. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
At Wed, 1 Aug 2018 09:25:18 -0700, Andres Freund <andres@anarazel.de> wrote in <20180801162518.jnb2ql5dfmgwp4qo@alap3.anarazel.de> > Hi, > > The issue at [1] is caused by missing invalidations, and [2] seems like > a likely candidate too. I wonder if it'd be good to have a relcache test > mode akin to CLOBBER_CACHE_ALWAYS and RELCACHE_FORCE_RELEASE, that tries > to ensure that we've done sufficiently to ensure the right invalidations > are sent. > > I think what we'd kind of want is to ensure that relcache entries are > rebuilt at the earliest possible time, but *not* later. That'd mean > they're out of date if there's missing invalidations. Unfortunately I'm > not clear on how that'd be achievable? Ideas? > > The best I can come up with is to code some additional dependencies into > CacheInvalidateHeapTuple(), and add tracking ensuring we've sent the > right messages. But that seems somewhat painful and filled with holes. > > [1] http://archives.postgresql.org/message-id/CAKoxK%2B5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf%3DHEQ%40mail.gmail.com > [2] https://www.postgresql.org/message-id/12259.1532117714@sss.pgh.pa.us FWIW, I revisit here. Maybe stupid, but does it make sense that we always build a new relcache entry in RelationIdGetRelation then "logically" compare it with the entry found in relcache? I'm not sure how referece count affects this, though.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Fri, Aug 3, 2018 at 12:34 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: >> I reread through the thread and IUCC, drop_index() is sending >> inval on the owing relation and invalidation happens (that is, > > I finally understand that I was totally inept. This came from > *the result* of the original -bug thread. I certainly would not say that you were in any way inept. Perhaps there is an argument for *also* doing what you propose. I am not dismissive of your idea. It seems like a question that should be considered separately, on another thread. If you still want to pursue it. -- Peter Geoghegan
Hello. At Fri, 3 Aug 2018 15:42:22 -0700, Peter Geoghegan <pg@bowt.ie> wrote in <CAH2-WzmXcHXa0MKx5a9NiaaCOE4E4T_rnaHa-N4gN-VoWUT8aw@mail.gmail.com> > On Fri, Aug 3, 2018 at 12:34 AM, Kyotaro HORIGUCHI > <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > >> I reread through the thread and IUCC, drop_index() is sending > >> inval on the owing relation and invalidation happens (that is, > > > > I finally understand that I was totally inept. This came from > > *the result* of the original -bug thread. > > I certainly would not say that you were in any way inept. Perhaps > there is an argument for *also* doing what you propose. I am not > dismissive of your idea. I jumped to the URL to see the thread but it was not up-to-date. It's the cause of my confusion that I thought the problem was the error seen *after* the invalidation fix patch in my repo. > It seems like a question that should be considered separately, on > another thread. If you still want to pursue it. Thanks. I might bring this again later. regards. -- Kyotaro Horiguchi NTT Open Source Software Center