Thread: \d t: ERROR: XX000: cache lookup failed for relation
Resending to -hackers
https://www.postgresql.org/message-id/20180527022401.GA20949%40telsasoft.com

Is that considered an actionable problem?

Encountered consistently while trying to reproduce the vacuum full
pg_statistic/toast_2619 bug; while running a loop around VAC FULL and more in
another session:

[1]-  Running    { time sh -ec 'while :; do psql --port 5678 postgres -qc "VACUUM FULL pg_toast.pg_toast_2619";psql --port 5678 postgres -qc "VACUUM FULL pg_statistic"; done'; date; } &
[2]+  Running    time while :; do psql postgres --port 5678 -c "INSERT INTO t SELECT i FROM generate_series(1,999999) i"; sleep 1; for a in `seq 999`; do psql postgres --port 5678 -c "ALTER TABLE t ALTER i TYPE int USING i::int"; sleep 1; psql postgres --port 5678 -c"ALTER TABLE t ALTER i TYPE bigint"; sleep 1; done; psql postgres --port 5678 -c "TRUNCATE t"; sleep 1; done &

$ psql --port 5678 postgres -x
psql (11beta1)
...
postgres=# \set VERBOSITY verbose
postgres=# \d t
ERROR:  XX000: cache lookup failed for relation 8096742
LOCATION:  flatten_reloptions, ruleutils.c:11065

Justin
> Is that considered an actionable problem?

I think so, but I'm not able to reproduce it. I wrote a script to simplify the
case, but it doesn't reproduce the error either. How long should I wait for it
to reproduce? I waited for one hour.

-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                   WWW: http://www.sigaev.ru/
Attachment
On Mon, Jun 04, 2018 at 07:12:53PM +0300, Teodor Sigaev wrote:
> > Is that considered an actionable problem?
>
> I think so. but I'm not able to reproduce that, I wrote a script to simplify

The failure is triggered by running "\d t" in (yet) another session - sorry if
that was unclear. It fails very consistently, probably over 75% of the time.

Also note that my "INSERT" was run in a separate loop, concurrent with the
VACUUM and ALTER, but yours is running consecutively.

Justin
> The failure is triggered by running "\d t" in (yet) another session - sorry if
> that was unclear. It fails very consistently, probably over 75% of the time.

No, no, I understood that. I tried \d in one more session.

> Also note that my "INSERT" was run in a separate loop, concurrent with the
> VACUUM and ALTER, but yours is running consecutively.

Both loops run in the background. I tried running two scripts and got a lot of
deadlocks, but the problem did not reproduce.

-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                   WWW: http://www.sigaev.ru/
On Mon, Jun 04, 2018 at 08:01:41PM +0300, Teodor Sigaev wrote:
> > Also note that my "INSERT" was run in a separate loop, concurrent with the
> > VACUUM and ALTER, but yours is running consecutively.
>
> both loops run in background. I tried to run two scripts - and got a lot of
> deadlocks but not a problem reproduction.

Ah, I think this is the missing, essential component:
CREATE INDEX ON t(right(i::text,1)) WHERE i::text LIKE '%1';

I can reproduce it running just this loop:
time while :; do for a in `seq 999`; do psql postgres --port 5678 -c "ALTER TABLE t ALTER i TYPE int USING i::int"; done; done

Justin
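For readers following along, the pieces reported so far might be combined into one script roughly as below. This is only a sketch under stated assumptions, not the script either poster ran: the thread never shows how t was created, so the CREATE TABLE statement is a guess, and the names setup_sql, alter_loop, describe_once and the RUN_REPRO switch are hypothetical. Only the port, the CREATE INDEX, the ALTER TABLE loop and the "\d t" probe come from the messages above.

```shell
#!/bin/sh
# Sketch only: consolidates the steps described in the thread.
PORT="${PORT:-5678}"   # port used in the thread
DB="${DB:-postgres}"   # database used in the thread

# Print the setup SQL. ASSUMPTION: the definition of t is never shown
# in the thread; a single int column named i is a guess.
setup_sql() {
  cat <<'EOF'
DROP TABLE IF EXISTS t;
CREATE TABLE t (i int);
INSERT INTO t SELECT i FROM generate_series(1,999999) i;
CREATE INDEX ON t(right(i::text,1)) WHERE i::text LIKE '%1';
EOF
}

# Session 1: the ALTER TABLE loop reported as sufficient once the index exists.
alter_loop() {
  while :; do
    psql --port "$PORT" -c "ALTER TABLE t ALTER i TYPE int USING i::int" "$DB"
  done
}

# Session 2: the describe command that reportedly hits the error.
describe_once() {
  psql --port "$PORT" -v VERBOSITY=verbose -c '\d t' "$DB"
}

# Nothing touches a server unless RUN_REPRO=1 is set (hypothetical switch).
if [ "${RUN_REPRO:-0}" = 1 ]; then
  setup_sql | psql --port "$PORT" -q "$DB"
  alter_loop &
  loop_pid=$!
  trap 'kill "$loop_pid" 2>/dev/null' EXIT
  k=0
  while [ "$k" -lt 20 ]; do
    describe_once
    k=$((k + 1))
  done
fi
```

Run with RUN_REPRO=1 against a scratch cluster only, since the script drops and recreates t. Whether and how often the error appears will depend on timing.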
> Ah, I think this is the missing, essential component:
> CREATE INDEX ON t(right(i::text,1)) WHERE i::text LIKE '%1';

Finally, I reproduced it with the attached script.

INSERT 0 999999  <- first insertion
ERROR:  cache lookup failed for relation 1032219
ALTER TABLE
ERROR:  cache lookup failed for relation 1033478
ALTER TABLE
ERROR:  cache lookup failed for relation 1034073
ALTER TABLE
ERROR:  cache lookup failed for relation 1034650
ALTER TABLE
ERROR:  cache lookup failed for relation 1035238
ALTER TABLE
ERROR:  cache lookup failed for relation 1035837

Will investigate.

-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                   WWW: http://www.sigaev.ru/
Attachment
Teodor Sigaev wrote:
>> Ah, I think this is the missing, essential component:
>> CREATE INDEX ON t(right(i::text,1)) WHERE i::text LIKE '%1';
> Finally, I reproduce it with attached script.

A simplified version of the script is attached. psql uses an ordinary SQL query
to get information about the index, with the usual transaction isolation/MVCC.
To build the description of the index, it calls pg_get_indexdef(), which does
not use the transaction snapshot; it uses the catalog snapshot, because it
accesses the catalog through the system catalog cache. So the difference is in
the snapshot used by the ordinary SQL query and by pg_get_indexdef(). I'm not
sure it is easy to fix, or whether it should be fixed at all.

Simplified query:
SELECT c2.relname, i.indexrelid,
       pg_catalog.pg_get_indexdef(i.indexrelid, 0, true)
FROM pg_catalog.pg_class c, pg_catalog.pg_class c2, pg_catalog.pg_index i
WHERE c.relname = 't' AND c.oid = i.indrelid AND i.indexrelid = c2.oid

-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                   WWW: http://www.sigaev.ru/
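To gauge how often the simplified query loses the race, one could run it repeatedly while the ALTER TABLE loop is active in another session and count the failures. The sketch below is an illustration, not part of the attached script: the helper names simplified_query and count_failures and the RUN_PROBE and ATTEMPTS variables are hypothetical, and the port and database are those used earlier in the thread.

```shell
#!/bin/sh
# Sketch only: probe the simplified query and count the reported error.
PORT="${PORT:-5678}"   # port used in the thread
DB="${DB:-postgres}"   # database used in the thread

# Print the simplified query from the message above.
simplified_query() {
  cat <<'EOF'
SELECT c2.relname, i.indexrelid,
       pg_catalog.pg_get_indexdef(i.indexrelid, 0, true)
FROM pg_catalog.pg_class c, pg_catalog.pg_class c2, pg_catalog.pg_index i
WHERE c.relname = 't' AND c.oid = i.indrelid AND i.indexrelid = c2.oid;
EOF
}

# Read psql output on stdin and print how many lines carry the error.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_failures() {
  grep -c 'cache lookup failed for relation' || true
}

# Nothing touches a server unless RUN_PROBE=1 is set (hypothetical switch).
if [ "${RUN_PROBE:-0}" = 1 ]; then
  n="${ATTEMPTS:-100}"
  k=0
  while [ "$k" -lt "$n" ]; do
    simplified_query | psql --port "$PORT" -qAt "$DB" 2>&1
    k=$((k + 1))
  done | count_failures
fi
```

The printed number is the count of attempts, out of ATTEMPTS, whose output contained the error. A count of zero does not show the race is absent, only that it was not hit in that run.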