Re: [HACKERS] per-sesson errors after interrupting CLUSTERpg_attribute (not attrdef) - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: [HACKERS] per-sesson errors after interrupting CLUSTERpg_attribute (not attrdef)
Date
Msg-id 20180320180357.GE24022@telsasoft.com
Whole thread Raw
In response to Re: [HACKERS] per-sesson errors after interrupting CLUSTER pg_attrdef  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: [HACKERS] per-sesson errors after interrupting CLUSTERpg_attribute (not attrdef)  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
On Tue, Oct 24, 2017 at 04:56:27PM -0700, Michael Paquier wrote:
> On Fri, Oct 20, 2017 at 9:01 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > This was briefly scary but seems to have been limited to my psql session (no
> > other errors logged).  Issue with catcache (?)
> >
> > I realized that the backup job I'd kicked off was precluding the CLUSTER from
> > running, but that CLUSTER was still holding lock and stalling everything else
> > under the sun.
> >
> > ts=# \df qci_add*
> > ERROR:  could not read block 8 in file "base/16400/999225102": read only 0 of 8192 bytes
> > ts=# \dt+ pg_att
> >
> > ts=# \dt+ pg_attrdef
> > ERROR:  could not read block 8 in file "base/16400/999225102": read only 0 of 8192 bytes
> 
> Perhaps that's too late, but to which relation is this relfilenode
> actually referring to?

Sorry for the late response :)

..but I've had mixed success reproducign this, and wasn't sure if it even an
issue under current postgres (the original issue was a 9.6 server+10.0client).

Now I know, this is still an issue under PG10.2:

[pryzbyj@united-telsasoft ~]$ psql ts
psql (10.2)
Type "help" for help.

ts=# CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;
INFO:  clustering "pg_catalog.pg_attribute" using sequential scan and sort
INFO:  "pg_attribute": found 18074 removable, 2488511 nonremovable row versions in 80769 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU: user: 9.66 s, system: 1.92 s, elapsed: 22.29 s.
^CCancel request sent
ERROR:  canceling statement due to user request
ts=# CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;
INFO:  clustering "pg_catalog.pg_attribute" using sequential scan and sort
INFO:  "pg_attribute": found 18102 removable, 2488511 nonremovable row versions in 80769 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU: user: 10.00 s, system: 1.59 s, elapsed: 27.60 s.
^CCancel request sent
ERROR:  canceling statement due to user request
ts=# CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;
ERROR:  could not open file "base/16400/948150297": No such file or directory
ts=# SELECT * FROM pg_attribute WHERE pg_relation_filenode=948150297;
ERROR:  column "pg_relation_filenode" does not exist
LINE 1: SELECT * FROM pg_attribute WHERE pg_relation_filenode=948150...
                                         ^
ts=# SELECT * FROM pg_attribute WHERE pg_relation_filenode()=948150297;
ERROR:  function pg_relation_filenode() does not exist
LINE 1: SELECT * FROM pg_attribute WHERE pg_relation_filenode()=9481...
                                         ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
ts=# SELECT * FROM pg_class WHERE pg_relation_filenode(oid)=948150297;
ERROR:  could not open file "base/16400/948150297": No such file or directory
ts=# 

In another session:
ts=# SELECT * FROM pg_class WHERE pg_relation_filenode(oid)=948150297;
 relname | relnamespace | reltype | reloftype | relowner | relam | relfilenode | reltablespace | relpages | reltuples |
relallvisible| reltoastrelid | relhasindex | relisshared | relpersistence | relkind | relnatts | relchecks | relhasoids
|rel
 
haspkey | relhasrules | relhastriggers | relhassubclass | relrowsecurity | relforcerowsecurity | relispopulated |
relreplident| relispartition | relfrozenxid | relminmxid | relacl | reloptions | relpartbound
 

---------+--------------+---------+-----------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+----------------+---------+----------+-----------+------------+----

--------+-------------+----------------+----------------+----------------+---------------------+----------------+--------------+----------------+--------------+------------+--------+------------+--------------
(0 rows)

postgres=# SELECT session_line, message, query FROM postgres_log_2018_03_20_1200 WHERE pid=29855 AND
session_id='5ab147d9.749f'ORDER BY 1;
 
session_line|message|query
1|statement: SELECT pg_catalog.quote_ident(c2.relname)   FROM pg_catalog.pg_class c1, pg_catalog.pg_class c2,
pg_catalog.pg_indexi WHERE c1.oid=i.indrelid and i.indexrelid=c2.oid       and (0 = pg_catalog.length(''))       and
pg_catalog.quote_ident(c1.relname)='pg_attribute'      and pg_catalog.pg_table_is_visible(c2.oid)
 
LIMIT 1000|
2|statement: SELECT pg_catalog.quote_ident(c2.relname)   FROM pg_catalog.pg_class c1, pg_catalog.pg_class c2,
pg_catalog.pg_indexi WHERE c1.oid=i.indrelid and i.indexrelid=c2.oid       and (24 =
pg_catalog.length('pg_attribute_relid_attnu'))      and pg_catalog.quote_ident(c1.relname)='pg_attribute'       and
pg_catalog.pg_table_is_visible(c2.oid)
LIMIT 1000|
3|statement: CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;|
4|clustering "pg_catalog.pg_attribute" using sequential scan and sort|
5|temporary file: path "base/pgsql_tmp/pgsql_tmp29855.0", size 374013952|CLUSTER VERBOSE pg_attribute USING
pg_attribute_relid_attnum_index;
 
6|"pg_attribute": found 18074 removable, 2488511 nonremovable row versions in 80769 pages|
7|canceling statement due to user request|CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;
8|statement: CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;|
9|clustering "pg_catalog.pg_attribute" using sequential scan and sort|
10|temporary file: path "base/pgsql_tmp/pgsql_tmp29855.1", size 374013952|CLUSTER VERBOSE pg_attribute USING
pg_attribute_relid_attnum_index;
 
11|"pg_attribute": found 18102 removable, 2488511 nonremovable row versions in 80769 pages|
12|canceling statement due to user request|CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;
13|statement: CLUSTER VERBOSE pg_attribute USING pg_attribute_relid_attnum_index ;|
14|could not open file "base/16400/948150297": No such file or directory|CLUSTER VERBOSE pg_attribute USING
pg_attribute_relid_attnum_index;
 
15|statement: SELECT * FROM pg_attribute WHERE pg_relation_filenode=948150297;|
16|column "pg_relation_filenode" does not exist|SELECT * FROM pg_attribute WHERE pg_relation_filenode=948150297;
17|statement: SELECT * FROM pg_attribute WHERE pg_relation_filenode()=948150297;|
18|function pg_relation_filenode() does not exist|SELECT * FROM pg_attribute WHERE pg_relation_filenode()=948150297;
19|statement: SELECT * FROM pg_class WHERE pg_relation_filenode(oid)=948150297;|
20|could not open file "base/16400/948150297": No such file or directory|SELECT * FROM pg_class WHERE
pg_relation_filenode(oid)=948150297;
(20 rows)

I believe lines 1-2 were tab completion, which was also present in my original report.

In my original report, I believe \df wasn't working (or maybe tab completion
for a function?).  This is different, as the CLUSTER isn't working.

Also note the subject of my original report was wrong (pg_attrdef not pg_attribute).

I'll leave the broken session opened for awhile in cases someone has questions
about it other than relfilenode, which apparently isn't working..

I don't know much but..is this related to pg_filenode.map?

Justin


pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: [HACKERS] proposal: schema variables
Next
From: Tom Lane
Date:
Subject: Re: configure's checks for --enable-tap-tests are insufficient