Re: DISCARD TEMP results in "ERROR: cache lookup failed for type 0" - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: DISCARD TEMP results in "ERROR: cache lookup failed for type 0" |
Date | |
Msg-id | 4709.1562365346@sss.pgh.pa.us Whole thread Raw |
In response to | DISCARD TEMP results in "ERROR: cache lookup failed for type 0" (Manuel Rigger <rigger.manuel@gmail.com>) |
Responses |
Re: DISCARD TEMP results in "ERROR: cache lookup failed for type 0"
|
List | pgsql-bugs |
Manuel Rigger <rigger.manuel@gmail.com> writes: > CREATE TEMP TABLE t0(c0 INT GENERATED ALWAYS AS IDENTITY) PARTITION BY > HASH((t0.c0)); > VACUUM FULL; > DISCARD TEMP; -- unexpected: ERROR: cache lookup failed for type 0 Blech. The proximate cause is fairly obvious from poking around with a debugger: #0 errfinish (dummy=0) at elog.c:414 #1 0x00000000008a2694 in elog_finish (elevel=<value optimized out>, fmt=<value optimized out>) at elog.c:1376 #2 0x00000000008898ae in get_typlenbyvalalign (typid=0, typlen=0x2d42118, typbyval=0x2d42138, typalign=0x2d42158 "") at lsyscache.c:2057 #3 0x000000000088c2e0 in RelationBuildPartitionKey (relation=0x7f727c583a80) at partcache.c:233 get_typlenbyvalalign(key->parttypid[i], ...); #4 0x00000000008957a7 in RelationBuildDesc ( targetRelId=<value optimized out>, insertIt=true) at relcache.c:1189 #5 0x0000000000897ce6 in RelationIdGetRelation (relationId=53072) at relcache.c:1956 #6 0x00000000004ca6b0 in relation_open (relationId=53072, lockmode=<value optimized out>) at heapam.c:1135 #7 0x0000000000535c89 in heap_drop_with_catalog (relid=53072) at heap.c:1828 #8 0x0000000000533074 in doDeletion (object=0x2d4dfe4, depRel=0x7ffd1d23b3d8, flags=<value optimized out>) at dependency.c:1134 #9 deleteOneObject (object=0x2d4dfe4, depRel=0x7ffd1d23b3d8, flags=<value optimized out>) at dependency.c:1036 #10 0x000000000053310f in deleteObjectsInList (targetObjects=0x2c367b8, depRel=0x7ffd1d23b3d8, flags=29) at dependency.c:260 #11 0x0000000000533328 in performDeletion (object=0x7ffd1d23b410, behavior=DROP_CASCADE, flags=29) at dependency.c:341 #12 0x000000000053d283 in RemoveTempRelations () at namespace.c:4140 ... (gdb) f 3 (gdb) p *relation->rd_att $2 = {natts = 1, tdtypeid = 53074, tdtypmod = -1, tdhasoid = false, tdrefcount = 1, constr = 0x0, attrs = 0x7f727c6f68e0} (gdb) p *relation->rd_att->attrs $3 = {attrelid = 53072, attname = { data = "........pg.dropped.1........", '\000' <repeats 35 times>}, atttypid = 0, attstattarget = 0, attlen = 4, attnum = 1, attndims = 0, attcacheoff = 0, atttypmod = -1, attbyval = true, attstorage = 112 'p', attalign = 105 'i', attnotnull = false, atthasdef = false, atthasmissing = false, attidentity = 97 'a', attisdropped = true, attislocal = true, attinhcount = 0, attcollation = 0} So we're getting the failure because, when we come to delete the whole temp table, its partitioning column has already been dropped, causing RelationBuildPartitionKey (and hence any relation-opening operation) to fail. That means that order-of-operations in dependency.c is at the root of the issue. The relevant pg_depend entries are (your OIDs will vary): # select objid, pg_describe_object(classid,objid,objsubid) as obj, pg_describe_object(refclassid,refobjid,refobjsubid) as ref, deptype from pg_depend where ...; objid | obj | ref | deptype -------+--------------------+-----------------------+--------- 53071 | type t0_c0_seq | sequence t0_c0_seq | i 53070 | sequence t0_c0_seq | schema pg_temp_3 | n 53074 | type t0 | table t0 | i 53073 | type t0[] | type t0 | i 53072 | table t0 | schema pg_temp_3 | n 53070 | sequence t0_c0_seq | column c0 of table t0 | i (6 rows) deleteObjectsInList's target list is (gdb) p *targetObjects $1 = {refs = 0x2d87450, extras = 0x2d2c8f8, numrefs = 7, maxrefs = 32} (gdb) p targetObjects->refs[0] $2 = {classId = 1247, objectId = 53071, objectSubId = 0} (gdb) p targetObjects->refs[1] $3 = {classId = 1259, objectId = 53070, objectSubId = 0} (gdb) p targetObjects->refs[2] $4 = {classId = 1259, objectId = 53072, objectSubId = 1} (gdb) p targetObjects->refs[3] $5 = {classId = 1247, objectId = 53073, objectSubId = 0} (gdb) p targetObjects->refs[4] $6 = {classId = 1247, objectId = 53074, objectSubId = 0} (gdb) p targetObjects->refs[5] $7 = {classId = 1259, objectId = 53072, objectSubId = 0} (gdb) p targetObjects->refs[6] $8 = {classId = 2615, objectId = 16982, objectSubId = 0} So what's evidently happening is that it lands on the sequence first and follows that to column c0, making a deletion target entry (refs[2]) for that column, and only afterwards finds the whole table and makes a deletion target entry for the table (refs[5]). The behavior is dependent on the order of entries in pg_depend, so you might or might not see it (and that's why the VACUUM FULL contributes; it's changing the entry order). The example doesn't reproduce in v12/HEAD, but I fear that that is only a cosmetic side-effect of commit f1ad067fc (Sort the dependent objects before recursing in findDependentObjects()). [ pokes at it... ] Yup, you can break it in HEAD too, if you arrange for the sequence and the table to have different relative OIDs: regression=# create temp table t0(c0 int not null) partition by range(c0); CREATE TABLE regression=# alter table t0 alter column c0 add generated always as identity; ALTER TABLE regression=# select objid, pg_describe_object(classid,objid,objsubid) as obj, pg_describe_object(refclassid,refobjid,refobjsubid)as ref, deptype from pg_depend where ... objid | obj | ref | deptype -------+--------------------+-----------------------+--------- 43530 | type t0 | table t0 | i 43529 | type t0[] | type t0 | i 43528 | table t0 | schema pg_temp_3 | n 43534 | type t0_c0_seq | sequence t0_c0_seq | i 43533 | sequence t0_c0_seq | schema pg_temp_3 | n 43533 | sequence t0_c0_seq | column c0 of table t0 | i (6 rows) regression=# discard temp; psql: ERROR: cache lookup failed for type 0 ISTM the real fix here probably involves having some explicit dependencies between the table and its partitioning columns, rather than relying on this check: regression=# alter table t0 drop column c0; psql: ERROR: cannot drop column named in partition key regression=# \errverbose psql: error: ERROR: 42P16: cannot drop column named in partition key LOCATION: ATExecDropColumn, tablecmds.c:7041 which is evidently not even in the right place to guard this with any security. With an explicit dependency, we could force a drop of a partitioning column to be turned into a drop of the whole table. Without one, well, you can break it really easily: regression=# create domain d1 as int; CREATE DOMAIN regression=# create table t1 (f1 d1) partition by range(f1); CREATE TABLE regression=# drop domain d1 cascade; psql: NOTICE: drop cascades to column f1 of table t1 DROP DOMAIN regression=# \d t1 psql: ERROR: cache lookup failed for type 0 However, we can only apply a fix like that in HEAD and maybe v12; it seems too late to be adding such dependencies in v10/v11. Not sure what to do about those branches. regards, tom lane
pgsql-bugs by date: