Thread: table unreadable after altering related table embedded via a view
Hi there,
After altering a table definition with the aim of promoting an int column to bigint, I found that a related table (that references the original table's definition by way of a view) became unreadable. Selecting from the table errors out with "record type has not been registered", pg_dump also fails to dump out the data with the same error. Data can still be inserted but not read back. I have included a minimal example below.
Interestingly, if I change the view in the example into a regular table the error does not appear, so the problem seems specific in some way to usage of a view's record type as a column. Also, I can select from the table fine in the session that altered the table, it's only subsequent sessions that start seeing the error.
Postgres 14.1 on Ubuntu 20.04 (though I got the same result with 13.2 on CentOS 7)
Cheers,
Miles
---
#!/bin/bash
set -x
createdb mytest
psql mytest <<EOF
--- foo is the main table
create table foo (data int);
--- foo_view embellishes the main table with some other data (omitted here for clarity)
create view foo_view as select foo from foo;
--- foo_log records changes to the view
create table foo_log (ts timestamp, what foo_view);
--- change data from int to bigint
alter table foo add column big_data bigint;
update foo set big_data = data;
update foo_log set what.foo.big_data = (what).foo.data;
alter table foo drop column data;
alter table foo rename column big_data to data;
--- insert a row into foo and foo_log
insert into foo values (123456);
insert into foo_log select current_timestamp, foo_view from foo_view;
--- we can still select from foo_log in this session
select * from foo_log;
EOF
psql mytest <<EOF
\set verbosity verbose
--- but this one errors out
select * from foo_log;
EOF
--- Output ---
$ ./record_type_not_registered.sh
+ createdb mytest+ psql mytest
CREATE TABLE
CREATE VIEW
CREATE TABLE
ALTER TABLE
UPDATE 0
UPDATE 0
ALTER TABLE
ALTER TABLE
INSERT 0 1
INSERT 0 1
ts | what
----------------------------+--------------
2021-11-22 13:29:46.775704 | ("(123456)")
(1 row)
+ psql mytest
ERROR: record type has not been registered
--- Server log ---
2021-11-22 13:29:44.217 AEDT [2291] LOG: 00000: starting PostgreSQL 14.1 on x86_64-pc-linux-gnu, compiled by gcc-9 (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
2021-11-22 13:29:44.217 AEDT [2291] LOCATION: PostmasterMain, postmaster.c:1128
2021-11-22 13:29:44.217 AEDT [2291] LOG: 00000: listening on IPv4 address "127.0.0.1", port 5432
2021-11-22 13:29:44.217 AEDT [2291] LOCATION: StreamServerPort, pqcomm.c:582
2021-11-22 13:29:44.223 AEDT [2291] LOG: 00000: listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-11-22 13:29:44.223 AEDT [2291] LOCATION: StreamServerPort, pqcomm.c:577
2021-11-22 13:29:44.230 AEDT [2292] LOG: 00000: database system was shut down at 2021-11-22 13:29:40 AEDT
2021-11-22 13:29:44.230 AEDT [2292] LOCATION: StartupXLOG, xlog.c:6536
2021-11-22 13:29:44.235 AEDT [2291] LOG: 00000: database system is ready to accept connections
2021-11-22 13:29:44.235 AEDT [2291] LOCATION: reaper, postmaster.c:3066
2021-11-22 13:29:46.781 AEDT [2356] ERROR: 42809: record type has not been registered
2021-11-22 13:29:46.781 AEDT [2356] LOCATION: lookup_rowtype_tupdesc_internal, typcache.c:1809
2021-11-22 13:29:46.781 AEDT [2356] STATEMENT: select * from foo_log;
2021-11-22 13:29:44.217 AEDT [2291] LOCATION: PostmasterMain, postmaster.c:1128
2021-11-22 13:29:44.217 AEDT [2291] LOG: 00000: listening on IPv4 address "127.0.0.1", port 5432
2021-11-22 13:29:44.217 AEDT [2291] LOCATION: StreamServerPort, pqcomm.c:582
2021-11-22 13:29:44.223 AEDT [2291] LOG: 00000: listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-11-22 13:29:44.223 AEDT [2291] LOCATION: StreamServerPort, pqcomm.c:577
2021-11-22 13:29:44.230 AEDT [2292] LOG: 00000: database system was shut down at 2021-11-22 13:29:40 AEDT
2021-11-22 13:29:44.230 AEDT [2292] LOCATION: StartupXLOG, xlog.c:6536
2021-11-22 13:29:44.235 AEDT [2291] LOG: 00000: database system is ready to accept connections
2021-11-22 13:29:44.235 AEDT [2291] LOCATION: reaper, postmaster.c:3066
2021-11-22 13:29:46.781 AEDT [2356] ERROR: 42809: record type has not been registered
2021-11-22 13:29:46.781 AEDT [2356] LOCATION: lookup_rowtype_tupdesc_internal, typcache.c:1809
2021-11-22 13:29:46.781 AEDT [2356] STATEMENT: select * from foo_log;
Le lundi 22 novembre 2021, 03:44:37 CET Miles Delahunty a écrit : > $ ./record_type_not_registered.sh > + createdb mytest > + psql mytest > CREATE TABLE > CREATE VIEW > CREATE TABLE > ALTER TABLE > UPDATE 0 > UPDATE 0 > ALTER TABLE > ALTER TABLE > INSERT 0 1 > INSERT 0 1 > ts | what > ----------------------------+-------------- > 2021-11-22 13:29:46.775704 | ("(123456)") > (1 row > > + psql mytest > ERROR: record type has not been registered > I started to take a look at this, and while I haven't gotten to the bottom of it yet, here are my observations. What I notice is that every field of the view type inserted before the change is correctly typed as a composite, with type = foo_view, and it's content is a composite, with type = foo. But after the change, a newly inserted tuple in foo_log is still correctly identified as a composite of type foo_view, but it's content is now typed as anonymous record (datum_typeid=2249, c9080000 on disk). Here is the slightly changed test case showing the difference in the stored datum_typeid stored before and after the base table change: create extension pageinspect; CREATE EXTENSION --- foo is the main table create table foo (data int); CREATE TABLE create view foo_view as select foo from foo; CREATE VIEW create table foo_log (id serial, what foo_view); CREATE TABLE insert into foo values (1); INSERT 0 1 insert into foo_log (what) select foo_view from foo_view; INSERT 0 1 -- Now check the oids of various types. SELECT typname, oid FROM pg_type WHERE typname in ('record', 'foo'); typname | oid ---------+-------- record | 2249 foo | 154218 (2 rows) -- Check what is the type stored for the composited type of what.foo -- It is the encoded representation of the oid of foo. select v from (select (encode(substr(t_data, 31, 4), 'hex')) as v from heap_page_items(get_raw_page('foo_log', 0))) t; v ---------- 6a5a0200 (1 row) alter table foo add column big_data bigint; ALTER TABLE update foo set big_data = data; UPDATE 1 update foo_log set what.foo.big_data = (what).foo.data; UPDATE 1 alter table foo drop column data; ALTER TABLE alter table foo rename column big_data to data; ALTER TABLE insert into foo values (2); INSERT 0 1 insert into foo_log (what) select foo_view from foo_view where (foo).data = 2; INSERT 0 1 -- Now perform the same check -- The first tuple, and it's updated versions both have the correct values for -- the type of what.foo, but the newly inserted one has 'c9080000' which is the -- oid for record. select v from (select (encode(substr(t_data, 31, 4), 'hex')) as v from heap_page_items(get_raw_page('foo_log', 0))) t; v ---------- 6a5a0200 6a5a0200 c9080000 (3 rows) + psql -a mytest \set verbosity verbose -- This one is ok select * from foo_log where id = 1; id | what ----+--------- 1 | ("(1)") (1 row) -- This one is unreadable select * from foo_log where id = 2; ERROR: record type has not been registered -- Ronan Dunklau
Ronan Dunklau <ronan.dunklau@aiven.io> writes: > Le lundi 22 novembre 2021, 03:44:37 CET Miles Delahunty a écrit : >> ERROR: record type has not been registered > I started to take a look at this, and while I haven't gotten to the bottom of > it yet, here are my observations. Thanks for poking at it! I dug a bit further and found that the proximate cause of the problem is ExecTypeSetColNames, which is deciding that the column names of a wholerow Var have changed and then changing the output tuple type to RECORD. Now, in the example as given, the *live* column names haven't changed. So one problem is that ExecTypeSetColNames isn't expecting that the tupdesc's attisdropped columns might not exactly line up with the colnames list it's given. We could hack that up to work, but if you change the example so that the names don't match, say by skipping the "rename big_data to data" step, the failure comes right back. The reason the passed-in colnames list doesn't match is that it's from the view's stored RTE for foo, which still has the original colnames list of just ("data"), with no allowance for any adjustments made after the view was created. This code all comes from commit bf7ca1587. I think changing the column names to match the RTE is essential if we want to have the desired semantics for the examples in that commit. And once we do that, the tuples are indeed not of the original rowtype, so switching to RECORD seems unavoidable. The problem is that we're allowing the now-RECORD tuples to be stored back into a table. I think we've got to throw an error instead. But we don't, because the code that's responsible for checking that is looking at the Var in the query's tlist and seeing that it claims to emit values of the appropriate rowtype. So it seems to me that this is a basic design error in bf7ca1587 (and hence my fault :-(). We should not be trying to hack up the values' rowtype in the executor; that has to happen earlier, probably in the planner, and then we must fix the whole-row Var to tell the truth about which rowtype it will emit. I'll work on a fix, but it's probably not a trivial patch. regards, tom lane