Thread: BUG #15528: on v11.0 version still get error "ERROR: catalog ismissing 1 attribute(s) for relid 6855092"
BUG #15528: on v11.0 version still get error "ERROR: catalog ismissing 1 attribute(s) for relid 6855092"
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 15528 Logged by: zhou xiaowei Email address: 110876189@qq.com PostgreSQL version: 11.0 Operating system: Linux 3.10.0-327.36.58.4.x86_64 Description: Hi,my friend. I dump out 27969 tables from PG v9.6.8 one by one, use "pg_dump -t xxx -f xxx.sql". Then start 300 threads to parallel import these .sql into v11.0. after data imported completely, parallel create constraint for every table. when all completely, execute "vacuum full",I got error "ERROR: catalog is missing 1 attribute(s) for relid 6855092", which occured on 11beta2 solved by BUG #15309. Today, I want to make sure which step cause this issue, I only start 300 threads to parallel import these .sql, not create constraint for them. when execute "vacuum full", I got error "ERROR: duplicate key value violates unique constraint "pg_class_relname_nsp_index"" the details: ERROR: duplicate key value violates unique constraint "pg_class_relname_nsp_index" DETAIL: Key (relname, relnamespace)=(pg_toast_17589_index, 99) already exists. STATEMENT: vacuum full; after "vacuum full" error,I got different info about pg_toast_17589_index from pg_class: ncc_1126=# select relname from pg_class where relname = 'pg_toast_17589_index'; relname ---------------------- pg_toast_17589_index pg_toast_17589_index (2 rows) ncc_1126=# select * from pg_class where relname = 'pg_toast_17589_index'; relname | relnamespace | reltype | reloftype | relowner | relam | relfilenode | reltablespace | relpages | reltuples | relallvisible | reltoastrelid | relhasindex | relisshared | relpersistence | relkind | relnatts | relchecks | relhasoids | relhasrules | relhastriggers | relhassubclass | relrowsecurity | relforcerowsecurity | relispopulated | relreplident | relispartition | relrewrite | relfrozenxid | relminmxid | relacl | reloptions | relpartbound ----------------------+--------------+---------+-----------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+----------------+---------+----------+-----------+------------+-------------+----------------+----------------+----------------+---------------------+----------------+--------------+----------------+------------+--------------+------------+--------+------------+-------------- pg_toast_17589_index | 99 | 0 | 0 | 16396 | 403 | 1739800 | 0 | 1 | 0 | 0 | 0 | f | f | p | i | 2 | 0 | f | f | f | f | f | f | t | n | f | 0 | 0 | 0 | | | pg_toast_178529 | 99 | 178537 | 0 | 16396 | 0 | 1807123 | 0 | 0 | 0 | 0 | 0 | t | f | p | t | 3 | 0 | f | f | f | f | f | f | t | n | f | 0 | 96029 | 1 | | | (2 rows) two querys got different content for column 'relname'. .sql's content as follow: -- -- PostgreSQL database dump -- -- Dumped from database version 9.6.8 -- Dumped by pg_dump version 9.6.8 SET statement_timeout = 0; SET lock_timeout = 0; SET idle_in_transaction_session_timeout = 0; SET client_encoding = 'UTF8'; SET standard_conforming_strings = on; SELECT pg_catalog.set_config('search_path', '', false); SET check_function_bodies = false; SET client_min_messages = warning; SET row_security = off; SET default_tablespace = ''; SET default_with_oids = false; -- -- Name: hrtm_myquestion; Type: TABLE; Schema: ncc_gh; Owner: ncc_gh -- CREATE TABLE ncc_gh.hrtm_myquestion ( xxxx ); ALTER TABLE ncc_gh.hrtm_myquestion OWNER TO ncc_gh; -- -- Data for Name: hrtm_myquestion; Type: TABLE DATA; Schema: ncc_gh; Owner: ncc_gh -- COPY ncc_gh.hrtm_myquestion (...... \. -- -- PostgreSQL database dump complete -- what happened? is this a bug? To get more details for this issue, how can I do it?
Re: BUG #15528: on v11.0 version still get error "ERROR: catalog is missing 1 attribute(s) for relid 6855092"
From
Tom Lane
Date:
=?utf-8?q?PG_Bug_reporting_form?= <noreply@postgresql.org> writes: > To get more details for this issue, how can I do it? It sounds like you have a script for causing this problem, would you share that? regards, tom lane
Re: BUG #15528: on v11.0 version still get error "ERROR: catalog is missing 1 attribute(s) for relid 6855092"
From
"小威"
Date:
all script same as attached .sql file.these .sql are customer's data,I'm forbided to get out.
PG Bug reporting form <noreply@postgresql.org> writes:
> To get more details for this issue, how can I do it?
It sounds like you have a script for causing this problem,
would you share that?
regards, tom lane
---Original---
From: "Tom Lane"<tgl@sss.pgh.pa.us>
Date: Fri, Nov 30, 2018 23:01 PM
To: "110876189"<110876189@qq.com>;
Cc: "pgsql-bugs"<pgsql-bugs@lists.postgresql.org>;
Subject: Re: BUG #15528: on v11.0 version still get error "ERROR: catalog is missing 1 attribute(s) for relid 6855092"
PG Bug reporting form <noreply@postgresql.org> writes:
> To get more details for this issue, how can I do it?
It sounds like you have a script for causing this problem,
would you share that?
regards, tom lane
Re: BUG #15528: on v11.0 version still get error "ERROR: catalog ismissing 1 attribute(s) for relid 6855092"
From
Peter Geoghegan
Date:
On Fri, Nov 30, 2018 at 12:17 AM PG Bug reporting form <noreply@postgresql.org> wrote: > I dump out 27969 tables from PG v9.6.8 one by one, use "pg_dump -t xxx -f > xxx.sql". Then start 300 threads to parallel import these .sql into > v11.0. > after data imported completely, parallel create constraint for every > table. > when all completely, execute "vacuum full",I got error "ERROR: catalog is > missing 1 attribute(s) for relid 6855092", which occured on 11beta2 solved > by BUG #15309. BUG #15309 could potentially have quite a few symptoms, including this one. > Today, I want to make sure which step cause this issue, I only start 300 > threads to parallel import these .sql, not create constraint for them. It seems like you're asking about how to reproduce corruption with BUG #15309 on a pre-release version of Postgres 11 (a v11 without the fix -- commit 9353d94a). I believe that you are not actually reporting a new bug. Is this understanding correct? Is there a new bug? Either way, you might find it interesting to see the result of this query, which relies on the v11 amcheck extension (so "CREATE EXTENSION amcheck" first): SELECT bt_index_parent_check(index => c.oid, heapallindexed => true), c.relname, c.relpages FROM pg_index i JOIN pg_opclass op ON i.indclass[0] = op.oid JOIN pg_am am ON op.opcmethod = am.oid JOIN pg_class c ON i.indexrelid = c.oid JOIN pg_namespace n ON c.relnamespace = n.oid WHERE am.amname = 'btree' AND n.nspname = 'pg_catalog' -- Don't check temp tables, which may be from another session: AND c.relpersistence != 't' -- Function may throw an error when this is omitted: AND c.relkind = 'i' AND i.indisready AND i.indisvalid ORDER BY c.relpages DESC; If that doesn't raise an error, you could try the same query, but remove "AND n.nspname = 'pg_catalog'". That will take considerably longer, but probably won't be intolerable. -- Peter Geoghegan
Re: BUG #15528: on v11.0 version still get error "ERROR: catalog ismissing 1 attribute(s) for relid 6855092"
From
Michael Paquier
Date:
On Sat, Dec 01, 2018 at 10:49:41AM +0800, 小威 wrote: > All script same as attached .sql file.these .sql are customer's > data,I'm forbided to get out. A self-contained test case is most welcome in those cases, particularly for such class of bugs. It could be possible to filter out any sensitive data with a similar schema, or fake data. What you are proposing upthread has very limited information, and the schema of the relation(s) involved can become very important. -- Michael
Attachment
Re: BUG #15528: on v11.0 version still get error "ERROR: catalog is missing 1 attribute(s) for relid 6855092"
From
"小威"
Date:
the database no inclulde amcheck extension.so I don't care how to product BUG#15039 by amcheck.I only worry my issue has some relation with it.next week,I'll continue to work on my issue.do you have some suggest to help quickly demarcate it?I have no idea now,except goto do many test.
On Fri, Nov 30, 2018 at 12:17 AM PG Bug reporting form
<noreply@postgresql.org> wrote:
> I dump out 27969 tables from PG v9.6.8 one by one, use "pg_dump -t xxx -f
> xxx.sql". Then start 300 threads to parallel import these .sql into
> v11.0.
> after data imported completely, parallel create constraint for every
> table.
> when all completely, execute "vacuum full",I got error "ERROR: catalog is
> missing 1 attribute(s) for relid 6855092", which occured on 11beta2 solved
> by BUG #15309.
BUG #15309 could potentially have quite a few symptoms, including this one.
> Today, I want to make sure which step cause this issue, I only start 300
> threads to parallel import these .sql, not create constraint for them.
It seems like you're asking about how to reproduce corruption with BUG
#15309 on a pre-release version of Postgres 11 (a v11 without the fix
-- commit 9353d94a). I believe that you are not actually reporting a
new bug. Is this understanding correct? Is there a new bug?
Either way, you might find it interesting to see the result of this
query, which relies on the v11 amcheck extension (so "CREATE EXTENSION
amcheck" first):
SELECT bt_index_parent_check(index => c.oid, heapallindexed => true),
c.relname,
c.relpages
FROM pg_index i
JOIN pg_opclass op ON i.indclass[0] = op.oid
JOIN pg_am am ON op.opcmethod = am.oid
JOIN pg_class c ON i.indexrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE am.amname = 'btree' AND n.nspname = 'pg_catalog'
-- Don't check temp tables, which may be from another session:
AND c.relpersistence != 't'
-- Function may throw an error when this is omitted:
AND c.relkind = 'i' AND i.indisready AND i.indisvalid
ORDER BY c.relpages DESC;
If that doesn't raise an error, you could try the same query, but
remove "AND n.nspname = 'pg_catalog'". That will take considerably
longer, but probably won't be intolerable.
--
Peter Geoghegan
---Original---
From: "Peter Geoghegan"<pg@bowt.ie>
Date: Sat, Dec 1, 2018 11:25 AM
To: "PostgreSQL mailing lists"<pgsql-bugs@lists.postgresql.org>;"110876189"<110876189@qq.com>;
Subject: Re: BUG #15528: on v11.0 version still get error "ERROR: catalog is missing 1 attribute(s) for relid 6855092"
On Fri, Nov 30, 2018 at 12:17 AM PG Bug reporting form
<noreply@postgresql.org> wrote:
> I dump out 27969 tables from PG v9.6.8 one by one, use "pg_dump -t xxx -f
> xxx.sql". Then start 300 threads to parallel import these .sql into
> v11.0.
> after data imported completely, parallel create constraint for every
> table.
> when all completely, execute "vacuum full",I got error "ERROR: catalog is
> missing 1 attribute(s) for relid 6855092", which occured on 11beta2 solved
> by BUG #15309.
BUG #15309 could potentially have quite a few symptoms, including this one.
> Today, I want to make sure which step cause this issue, I only start 300
> threads to parallel import these .sql, not create constraint for them.
It seems like you're asking about how to reproduce corruption with BUG
#15309 on a pre-release version of Postgres 11 (a v11 without the fix
-- commit 9353d94a). I believe that you are not actually reporting a
new bug. Is this understanding correct? Is there a new bug?
Either way, you might find it interesting to see the result of this
query, which relies on the v11 amcheck extension (so "CREATE EXTENSION
amcheck" first):
SELECT bt_index_parent_check(index => c.oid, heapallindexed => true),
c.relname,
c.relpages
FROM pg_index i
JOIN pg_opclass op ON i.indclass[0] = op.oid
JOIN pg_am am ON op.opcmethod = am.oid
JOIN pg_class c ON i.indexrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE am.amname = 'btree' AND n.nspname = 'pg_catalog'
-- Don't check temp tables, which may be from another session:
AND c.relpersistence != 't'
-- Function may throw an error when this is omitted:
AND c.relkind = 'i' AND i.indisready AND i.indisvalid
ORDER BY c.relpages DESC;
If that doesn't raise an error, you could try the same query, but
remove "AND n.nspname = 'pg_catalog'". That will take considerably
longer, but probably won't be intolerable.
--
Peter Geoghegan
Re: BUG #15528: on v11.0 version still get error "ERROR: catalog ismissing 1 attribute(s) for relid 6855092"
From
Peter Geoghegan
Date:
., s On Fri, Nov 30, 2018 at 8:52 PM 小威 <zxw110876189@qq.com> wrote: > the database no inclulde amcheck extension.so I don't care how to product BUG#15039 by amcheck.I only worry my issue hassome relation with it.next week,I'll continue to work on my issue.do you have some suggest to help quickly demarcate it?Ihave no idea now,except goto do many test. Sorry, I do not understand your questions. You have not told us if you believe there is a new bug or not. Do you? What you have shown at least looks very much like BUG#15039. It also seems like your problem is probably down to BUG#15039 because you said you were using a beta version of 11 at one point, and because your test case involved a VACUUM FULL of a system catalog where parallel CREATE INDEX is used (i.e. the specific factors required for BUG#15039 to cause corruption). What is your goal? -- Peter Geoghegan
Re: BUG #15528: on v11.0 version still get error "ERROR: catalog is missing 1 attribute(s) for relid 6855092"
From
"小威"
Date:
I'm working on 11.0, no beta version.and my db include BUG#15039 patch,but still occur this issue. actually,I don't known what happened.I only sure there still has a bug in my case.
., s
On Fri, Nov 30, 2018 at 8:52 PM 小威 <zxw110876189@qq.com> wrote:
> the database no inclulde amcheck extension.so I don't care how to product BUG#15039 by amcheck.I only worry my issue has some relation with it.next week,I'll continue to work on my issue.do you have some suggest to help quickly demarcate it?I have no idea now,except goto do many test.
Sorry, I do not understand your questions.
You have not told us if you believe there is a new bug or not. Do you?
What you have shown at least looks very much like BUG#15039. It also
seems like your problem is probably down to BUG#15039 because you said
you were using a beta version of 11 at one point, and because your
test case involved a VACUUM FULL of a system catalog where parallel
CREATE INDEX is used (i.e. the specific factors required for BUG#15039
to cause corruption).
What is your goal?
--
Peter Geoghegan
---Original---
From: "Peter Geoghegan"<pg@bowt.ie>
Date: Sat, Dec 1, 2018 13:03 PM
To: "zxw110876189"<zxw110876189@qq.com>;
Cc: "PostgreSQL mailing lists"<pgsql-bugs@lists.postgresql.org>;
Subject: Re: BUG #15528: on v11.0 version still get error "ERROR: catalog is missing 1 attribute(s) for relid 6855092"
., s
On Fri, Nov 30, 2018 at 8:52 PM 小威 <zxw110876189@qq.com> wrote:
> the database no inclulde amcheck extension.so I don't care how to product BUG#15039 by amcheck.I only worry my issue has some relation with it.next week,I'll continue to work on my issue.do you have some suggest to help quickly demarcate it?I have no idea now,except goto do many test.
Sorry, I do not understand your questions.
You have not told us if you believe there is a new bug or not. Do you?
What you have shown at least looks very much like BUG#15039. It also
seems like your problem is probably down to BUG#15039 because you said
you were using a beta version of 11 at one point, and because your
test case involved a VACUUM FULL of a system catalog where parallel
CREATE INDEX is used (i.e. the specific factors required for BUG#15039
to cause corruption).
What is your goal?
--
Peter Geoghegan
Re: BUG #15528: on v11.0 version still get error "ERROR: catalog ismissing 1 attribute(s) for relid 6855092"
From
Peter Geoghegan
Date:
On Fri, Nov 30, 2018 at 9:19 PM 小威 <zxw110876189@qq.com> wrote: > I'm working on 11.0, no beta version.and my db include BUG#15039 patch,but still occur this issue. actually,I don't knownwhat happened.I only sure there still has a bug in my case. I think that you'll probably need to provide a script to reproduce the issue. Can you make the data anonymous before sending? Alternatively, you can send it off-list. I promise that I will treat the application specific parts as confidential. Using amcheck would let us know if the index is completely corrupt (e.g. tuples are in the wrong order), or if the duplicate is garbage that has somehow become visible again, a symptom of bugs which is also possible. It's easy to use amcheck. Maybe there is another bug that is similar to #15309, but we need to establish if the index is truly corrupt, and what it looks like. It would also be good to use contrib/pageinspect to look at any blocks that amcheck shows in an error message. -- Peter Geoghegan
Re: BUG #15528: on v11.0 version still get error "ERROR: catalog ismissing 1 attribute(s) for relid 6855092"
From
Peter Geoghegan
Date:
On Fri, Nov 30, 2018 at 9:52 PM Peter Geoghegan <pg@bowt.ie> wrote: > Using amcheck would let us know if the index is completely corrupt > (e.g. tuples are in the wrong order), or if the duplicate is garbage > that has somehow become visible again, a symptom of bugs which is also > possible. It's easy to use amcheck. Maybe there is another bug that is > similar to #15309, but we need to establish if the index is truly > corrupt, and what it looks like. It would also be good to use > contrib/pageinspect to look at any blocks that amcheck shows in an > error message. Also, can you confirm that the problem goes away when max_parallel_maintenance_workers is set to 0? That's what we saw with bug #15309. -- Peter Geoghegan
Re: BUG #15528: on v11.0 version still get error "ERROR: catalog is missing 1 attribute(s) for relid 6855092"
From
"小威"
Date:
when set max_parallel_maintenance_workers = 0,no issue(vacuum full ok)now.before it my 2 times test were all failed.now I set it to 2 again and set log_min_message = debug5,log_min_error_statement=debug5,try to get more details.
On Fri, Nov 30, 2018 at 9:52 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Using amcheck would let us know if the index is completely corrupt
> (e.g. tuples are in the wrong order), or if the duplicate is garbage
> that has somehow become visible again, a symptom of bugs which is also
> possible. It's easy to use amcheck. Maybe there is another bug that is
> similar to #15309, but we need to establish if the index is truly
> corrupt, and what it looks like. It would also be good to use
> contrib/pageinspect to look at any blocks that amcheck shows in an
> error message.
Also, can you confirm that the problem goes away when
max_parallel_maintenance_workers is set to 0? That's what we saw with
bug #15309.
--
Peter Geoghegan
---Original---
From: "Peter Geoghegan"<pg@bowt.ie>
Date: Sat, Dec 1, 2018 14:14 PM
To: "zxw110876189"<zxw110876189@qq.com>;
Cc: "PostgreSQL mailing lists"<pgsql-bugs@lists.postgresql.org>;
Subject: Re: BUG #15528: on v11.0 version still get error "ERROR: catalog is missing 1 attribute(s) for relid 6855092"
On Fri, Nov 30, 2018 at 9:52 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Using amcheck would let us know if the index is completely corrupt
> (e.g. tuples are in the wrong order), or if the duplicate is garbage
> that has somehow become visible again, a symptom of bugs which is also
> possible. It's easy to use amcheck. Maybe there is another bug that is
> similar to #15309, but we need to establish if the index is truly
> corrupt, and what it looks like. It would also be good to use
> contrib/pageinspect to look at any blocks that amcheck shows in an
> error message.
Also, can you confirm that the problem goes away when
max_parallel_maintenance_workers is set to 0? That's what we saw with
bug #15309.
--
Peter Geoghegan
Re: BUG #15528: on v11.0 version still get error "ERROR: catalog ismissing 1 attribute(s) for relid 6855092"
From
Peter Geoghegan
Date:
On Mon, Dec 3, 2018 at 6:48 PM 小威 <zxw110876189@qq.com> wrote: > when set max_parallel_maintenance_workers = 0,no issue(vacuum full ok)now.before it my 2 times test were all failed.nowI set it to 2 again and set log_min_message = debug5,log_min_error_statement=debug5,try to get more details. That would be helpful. I also encourage you to install contrib/amcheck, and perform the verification I asked about. amcheck is not a third party extension -- it's maintained as part of the PostgreSQL codebase. You probably just have to install a separate OS package, and then you can "CREATE EXTENSION amcheck" from psql. The package that you need is probably called something like "postgresql-contrib-11", and will almost certainly be available from the same package repository that the server (e.g. "postgresql-11") came from. It should be easy for you. Thank you -- Peter Geoghegan