Thread: [BUGS] BUG #14769: Logical replication error "cache lookup failed for type 0"
[BUGS] BUG #14769: Logical replication error "cache lookup failed for type 0"
From
scott@deltaex.com
Date:
The following bug has been logged on the website: Bug reference: 14769 Logged by: Scott Milliken Email address: scott@deltaex.com PostgreSQL version: 10beta2 Operating system: Linux 4.10.0-27-generic #30~16.04.2-Ubuntu Description: I'm testing postgresql 10beta2, and found a bug with logical replication. The bug halts logical replication, and produces the error "cache lookup failed for type 0" in the secondary's log. The error can be reproduced by dropping a column in a replicated table. When a column is dropped, it zeroes out attypeid in pg_attribute: ``` test=# select * from pg_attribute where attrelid = 'public.testtbl'::regclass and atttypid = 0 order by attnum; -[ RECORD 1 ]-+----------------------------- attrelid | 16386 attname | ........pg.dropped.2........ atttypid | 0 attstattarget | 0 attlen | -1 attnum | 2 attndims | 0 attcacheoff | -1 atttypmod | -1 attbyval | f attstorage | x attalign | i attnotnull | f atthasdef | f attidentity | attisdropped | t attislocal | t attinhcount | 0 attcollation | 100 attacl | ¤ attoptions | ¤ attfdwoptions | ¤ ``` Perhaps there's a missing check against pg_attribute.attisdropped somewhere? Here's full reproduction steps: Primary: ``` mkdir -p /tmp/test0 PGPORT=5301 initdb -D /tmp/test0 echo "wal_level = logical" >> /tmp/test0/postgresql.conf PGPORT=5301 pg_ctl -D /tmp/test0 start for (( ; ; )); do if pg_isready -d postgres -p 5301; then break; fi; sleep 1; done psql -p 5301 postgres -c "CREATE USER test WITH PASSWORD 'test' SUPERUSER CREATEDB CREATEROLE LOGIN REPLICATION BYPASSRLS;" createdb -p 5301 -E utf8 test psql -p 5301 -U test test -c "CREATE TABLE testtbl (id int, name text);" psql -p 5301 -U test test -c "ALTER TABLE testtbl ADD CONSTRAINT testtbl_pkey PRIMARY KEY (id);" psql -p 5301 -U test test -c "CREATE PUBLICATION testpub FOR TABLE testtbl;" psql -p 5301 -U test test -c "INSERT INTO testtbl (id, name) VALUES (1, 'a');" # continue with "Secondary" instructions psql -p 5301 -U test test -c "UPDATE testtbl SET name = 'b';" # continue with "Secondary" instructions psql -p 5301 -U test test -c "ALTER TABLE testtbl DROP COLUMN name;" psql -p 5301 -U test test -c "ALTER TABLE testtbl ADD COLUMN name2 text;" psql -p 5301 -U test test -c "UPDATE testtbl SET name2 = 'b';" # continue with "Secondary" instructions ``` Secondary: ``` mkdir -p /tmp/test1 PGPORT=5302 initdb -D /tmp/test1 PGPORT=5302 pg_ctl -D /tmp/test1 start for (( ; ; )); do if pg_isready -d postgres -p 5302; then break; fi; sleep 1; done psql -p 5302 postgres -c "CREATE USER test WITH PASSWORD 'test' SUPERUSER CREATEDB CREATEROLE LOGIN REPLICATION BYPASSRLS;" createdb -p 5302 -E utf8 test psql -p 5302 -U test test -c "CREATE TABLE testtbl (id int, name text);" psql -p 5302 -U test test -c "ALTER TABLE testtbl ADD CONSTRAINT testtbl_pkey PRIMARY KEY (id);" psql -p 5302 -U test test -c "CREATE SUBSCRIPTION test0_testpub CONNECTION 'port=5301 user=test dbname=test' PUBLICATION testpub;" psql -p 5302 -U test test -c "SELECT * FROM testtbl;" # OK # continue with "Primary" instructions psql -p 5302 -U test test -c "SELECT * FROM testtbl;" # OK psql -p 5302 -U test test -c "ALTER TABLE testtbl DROP COLUMN name;" psql -p 5302 -U test test -c "ALTER TABLE testtbl ADD COLUMN name2 text;" # continue with "Primary" instructions psql -p 5302 -U test test -c "SELECT * FROM testtbl;" # not OK, doesn't receive update, errors in log ``` Here's the log: ``` 2017-07-23 22:42:28.719 PDT [7706] LOG: logical replication apply worker for subscription "test0_testpub" has started 2017-07-23 22:42:28.725 PDT [7706] ERROR: cache lookup failed for type 0 2017-07-23 22:42:28.725 PDT [7706] CONTEXT: processing remote data for replication target relation "public.testtbl" column "id", remote type integer, local type integer 2017-07-23 22:42:28.725 PDT [3359] LOG: worker process: logical replication worker for subscription 16395 (PID 7706) exited with exit code 1 ``` -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #14769: Logical replication error "cache lookup failedfor type 0"
From
Petr Jelinek
Date:
On 03/08/17 01:59, scott@deltaex.com wrote: > The following bug has been logged on the website: > > Bug reference: 14769 > Logged by: Scott Milliken > Email address: scott@deltaex.com > PostgreSQL version: 10beta2 > Operating system: Linux 4.10.0-27-generic #30~16.04.2-Ubuntu > Description: > > I'm testing postgresql 10beta2, and found a bug with logical replication. > The bug halts logical replication, and produces the error "cache lookup > failed for type 0" in the secondary's log. > > The error can be reproduced by dropping a column in a replicated table. When > a column is dropped, it zeroes out attypeid in pg_attribute: > > ``` > test=# select * from pg_attribute where attrelid = > 'public.testtbl'::regclass and atttypid = 0 order by attnum; > -[ RECORD 1 ]-+----------------------------- > attrelid | 16386 > attname | ........pg.dropped.2........ > atttypid | 0 > attstattarget | 0 > attlen | -1 > attnum | 2 > attndims | 0 > attcacheoff | -1 > atttypmod | -1 > attbyval | f > attstorage | x > attalign | i > attnotnull | f > atthasdef | f > attidentity | > attisdropped | t > attislocal | t > attinhcount | 0 > attcollation | 100 > attacl | ¤ > attoptions | ¤ > attfdwoptions | ¤ > ``` > > Perhaps there's a missing check against pg_attribute.attisdropped > somewhere? > Good catch! Looks like we didn't initialize the attribute map for locally dropped columns. Once I did that I found there is also typo in slot_modify_cstrings() which uses wrong value from the remote side, that is also made visible by your example. Both are fixed in attached patch. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Attachment
Re: [BUGS] BUG #14769: Logical replication error "cache lookup failedfor type 0"
From
Peter Eisentraut
Date:
On 8/4/17 05:41, Petr Jelinek wrote: >> Perhaps there's a missing check against pg_attribute.attisdropped >> somewhere? >> > > Good catch! Looks like we didn't initialize the attribute map for > locally dropped columns. fixed > Once I did that I found there is also typo in > slot_modify_cstrings() which uses wrong value from the remote side, that > is also made visible by your example. and fixed and added a test case -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #14769: Logical replication error "cache lookup failedfor type 0"
From
Scott Milliken
Date:
Thanks Petr and Peter! I've tested commit and confirmed that commit 0e58455d fixes this bug.
On Mon, Aug 7, 2017 at 7:54 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 8/4/17 05:41, Petr Jelinek wrote:
>> Perhaps there's a missing check against pg_attribute.attisdropped
>> somewhere?
>>
>
> Good catch! Looks like we didn't initialize the attribute map for
> locally dropped columns.
fixed
> Once I did that I found there is also typo in
> slot_modify_cstrings() which uses wrong value from the remote side, that
> is also made visible by your example.
and fixed and added a test case
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services