Thread: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Maxim Boguk
Date:
One of the servers under my support produced the following error two days ago:
ERROR: could not read block 135 in file "base/16404/118881486": read only 0 of 8192 bytes
Server version 9.0.6. No DB or server crashes, and no DB recovery, have happened on that server since setup.
The server is the DB backend for a very large web project under quite a heavy load (10k+ requests per second to the DB).
No other strange errors have been found in the logs since the server was put into production half a year ago.
Here is what happened (from the DB log):
2012-02-17 22:35:58 MSK 14333 [vxid:340/18229999 txid:2341883282] [DELETE] LOG: duration: 5.669 ms execute <unnamed>: delete from agency_statistics
2012-02-17 22:35:58 MSK 14333 [vxid:340/18230000 txid:0] [BIND] LOG: duration: 2.787 ms bind <unnamed>: insert into agency_statistics (employer_id, area_id, area_path, professional_area_id, vacancies_number) select e.employer_id, vb.area_id, a.path, s.professional_area_id, count(distinct v.vacancy_id) from vacancy v inner join employer e on v.employer_id = e.employer_id inner join vacancy_body vb on v.vacancy_body_id = vb.vacancy_body_id inner join vacancy_body_specialization vs on vb.vacancy_body_id = vs.vacancy_body_id inner join specialization s on s.specialization_id = vs.specialization_id inner join area a on vb.area_id = a.area_id where v.archived_status in (0) and v.disabled = false and e.category = $1 and e.state = $2 and e.manager_id > 0 group by e.employer_id, vb.area_id, a.path, s.professional_area_id
2012-02-17 22:35:59 MSK 14333 [vxid:340/18230000 txid:2341883336] [INSERT] LOG: duration: 726.700 ms execute <unnamed>: insert into agency_statistics (employer_id, area_id, area_path, professional_area_id, vacancies_number) select e.employer_id, vb.area_id, a.path, s.professional_area_id, count(distinct v.vacancy_id) from vacancy v inner join employer e on v.employer_id = e.employer_id inner join vacancy_body vb on v.vacancy_body_id = vb.vacancy_body_id inner join vacancy_body_specialization vs on vb.vacancy_body_id = vs.vacancy_body_id inner join specialization s on s.specialization_id = vs.specialization_id inner join area a on vb.area_id = a.area_id where v.archived_status in (0) and v.disabled = false and e.category = $1 and e.state = $2 and e.manager_id > 0 group by e.employer_id, vb.area_id, a.path, s.professional_area_id
2012-02-17 22:35:59 MSK 14333 [vxid:340/18230070 txid:0] [BIND] LOG: duration: 30.195 ms bind S_35: insert into agency_statistics (employer_id, area_id, area_path, professional_area_id, vacancies_number, rank) values ($1, $2, $3, $4, $5, 0)
2012-02-17 22:35:59 MSK 14333 [vxid:340/18230082 txid:2341883427] [INSERT] ERROR: could not read block 135 in file "base/16404/118881486": read only 0 of 8192 bytes
2012-02-17 22:35:59 MSK 14333 [vxid:340/18230082 txid:2341883427] [INSERT] STATEMENT: insert into agency_statistics (employer_id, area_id, area_path, professional_area_id, vacancies_number, rank) values ($1, $2, $3, $4, $5, 0)
No concurrent writes happened to that table outside of that connection.
Maybe autovacuum fired on that table between the delete and the first or second insert (unfortunately it wasn't logged).
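(Side note: to catch this next time I am going to enable autovacuum logging; a minimal postgresql.conf sketch, assuming the setting is still at its default of -1:

log_autovacuum_min_duration = 0   # log every autovacuum/autoanalyze action
)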
I recreated the table from scratch and kept the damaged table under another name (via alter table agency_statistics rename to agency_statistics_old).
So I have files to dig into.
What I see in the file system:
hh=# SELECT relfilenode from pg_class where relname='agency_statistics_old';
relfilenode
-------------
118881486
postgres@db10:~/tmp$ ls -la /var/lib/postgresql/9.0/main/base/16404/118881486
-rw------- 1 postgres postgres 0 2012-02-20 12:04 /var/lib/postgresql/9.0/main/base/16404/118881486
So the table file size is zero bytes (it seems autovacuum truncated the table to 0 bytes).
But the index shows:
hh=# SELECT relfilenode from pg_class where relname='agency_statistics_pkey';
relfilenode
-------------
118881489
postgres@db10:~/tmp$ ls -la /var/lib/postgresql/9.0/main/base/16404/118881489
-rw------- 1 postgres postgres 4956160 2012-02-20 12:07 /var/lib/postgresql/9.0/main/base/16404/118881489
hh=# SELECT * from pgstatindex('agency_statistics_pkey');
-[ RECORD 1 ]------+--------
version | 2
tree_level | 2
index_size | 4947968
root_block_no | 295
internal_pages | 1
leaf_pages | 601
empty_pages | 0
deleted_pages | 1
avg_leaf_density | 0.45
leaf_fragmentation | 13.14
I am out of ideas now.
Any suggestions where and what I should look next?
Kind Regards,
Maksym
--
Maxim Boguk
Senior Postgresql DBA.
Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com
LinkedIn profile: http://nz.linkedin.com/in/maximboguk
"If they can send one man to the moon... why can't they send them all?"
МойКруг: http://mboguk.moikrug.ru/
"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Tom Lane
Date:
Maxim Boguk <maxim.boguk@gmail.com> writes:
> One of the servers under my support produced the following error two days ago:
> ERROR: could not read block 135 in file "base/16404/118881486": read only 0 of 8192 bytes
> ...
> What I see in the file system:
> hh=# SELECT relfilenode from pg_class where relname='agency_statistics_old';
>  relfilenode
> -------------
>    118881486
> postgres@db10:~/tmp$ ls -la /var/lib/postgresql/9.0/main/base/16404/118881486
> -rw------- 1 postgres postgres 0 2012-02-20 12:04 /var/lib/postgresql/9.0/main/base/16404/118881486
> So the table file size is zero bytes (it seems autovacuum truncated the table to 0 bytes).

Hmmm .... something did, but I see no clear evidence that it was autovacuum.

Do you know why the mod date on the file is 2012-02-20 12:04? That's more than two days after the error in your logs, so it's not clear to me that the current state of the file tells us much about what happened on the 17th. If autovacuum had truncated the table then, and the table wasn't touched otherwise, the file mod date shouldn't have increased.

regards, tom lane
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Adrian Klaver
Date:
On Monday, February 20, 2012 5:32:01 pm Tom Lane wrote:
> Maxim Boguk <maxim.boguk@gmail.com> writes:
> > One of the servers under my support produced the following error two days ago:
> > ERROR: could not read block 135 in file "base/16404/118881486": read only 0 of 8192 bytes
> > ...
> > What I see in the file system:
> > hh=# SELECT relfilenode from pg_class where relname='agency_statistics_old';
> >  relfilenode
> > -------------
> >    118881486
> > postgres@db10:~/tmp$ ls -la /var/lib/postgresql/9.0/main/base/16404/118881486
> > -rw------- 1 postgres postgres 0 2012-02-20 12:04 /var/lib/postgresql/9.0/main/base/16404/118881486
> > So the table file size is zero bytes (it seems autovacuum truncated the table to 0 bytes).
>
> Hmmm .... something did, but I see no clear evidence that it was autovacuum.
>
> Do you know why the mod date on the file is 2012-02-20 12:04? That's more than two days after the error in your logs, so it's not clear to me that the current state of the file tells us much about what happened on the 17th. If autovacuum had truncated the table then, and the table wasn't touched otherwise, the file mod date shouldn't have increased.

If I am following correctly that is due to this:

"I recreated the table from scratch and kept the damaged table under another name (via alter table agency_statistics rename to agency_statistics_old). So I have files to dig into."

> regards, tom lane

--
Adrian Klaver
adrian.klaver@gmail.com
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Maxim Boguk
Date:
> > So the table file size is zero bytes (it seems autovacuum truncated the table to 0
> > bytes).
>
> Hmmm .... something did, but I see no clear evidence that it was
> autovacuum.
>
> Do you know why the mod date on the file is 2012-02-20 12:04? That's
> more than two days after the error in your logs, so it's not clear to me
> that the current state of the file tells us much about what happened on
> the 17th. If autovacuum had truncated the table then, and the table
> wasn't touched otherwise, the file mod date shouldn't have increased.
>
> regards, tom lane
Hi,
Cron attempted to populate the table once per hour after the problem happened.
And each time it produced the same error.
So the table did have attempted write activity after the error happened:
(
2012-02-18 00:36:01 MSK 27743 hh.app@hh from [local] [vxid:95/19965879 txid:2342867908] [INSERT] ERROR: could not read block 171 in file "base/16404/118881486": read only 0 of 8192 bytes
2012-02-18 00:36:01 MSK 27743 hh.app@hh from [local] [vxid:95/19965879 txid:2342867908] [INSERT] STATEMENT: insert into agency_statistics (employer_id, area_id, area_path, professional_area_id, vacancies_number) select e.employer_id, vb.area_id, a.path, s.professional_area_id, count(distinct v.vacancy_id) from vacancy v inner join employer e on v.employer_id = e.employer_id inner join vacancy_body vb on v.vacancy_body_id = vb.vacancy_body_id inner join vacancy_body_specialization vs on vb.vacancy_body_id = vs.vacancy_body_id inner join specialization s on s.specialization_id = vs.specialization_id inner join area a on vb.area_id = a.area_id where v.archived_status in (0) and v.disabled = false and e.category = $1 and e.state = $2 and e.manager_id > 0 group by e.employer_id, vb.area_id, a.path, s.professional_area_id
...
and so on until Monday when I was informed about that problem.
).
For reference, the table has quite a simple structure without any unusual features:
hh=# \d+ agency_statistics_old
Table "public.agency_statistics_old"
Column | Type | Modifiers | Storage | Description
----------------------+------------------------+-----------+----------+-------------
employer_id | integer | not null | plain |
area_id | integer | not null | plain |
area_path | character varying(255) | not null | extended |
professional_area_id | integer | not null | plain |
vacancies_number | integer | not null | plain |
rank | integer | | plain |
normalised_rank | integer | | plain |
Indexes:
"agency_statistics_pkey" PRIMARY KEY, btree (employer_id, area_id, professional_area_id) CLUSTER
Has OIDs: no
Options: fillfactor=50
I am almost sure I will get an error if I try to insert something into agency_statistics_old.
Should I perform that test?
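If it helps, this is a minimal sketch of the test I have in mind (the key values are made up, and I would wrap it in a transaction so the damaged table is left untouched):

BEGIN;
-- made-up key values; only the NOT NULL columns are filled in
INSERT INTO agency_statistics_old
    (employer_id, area_id, area_path, professional_area_id, vacancies_number)
VALUES (1, 1, '1.113', 1, 0);
ROLLBACK;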
Kind Regards,
Maxim
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Tom Lane
Date:
Maxim Boguk <maxim.boguk@gmail.com> writes:
>> Do you know why the mod date on the file is 2012-02-20 12:04?

> Cron attempted to populate the table once per hour after the problem happened.
> And each time it produced the same error.

That's interesting ... is there any possibility that the insertions were attempting to insert values that matched a previously-existing primary key value? I'm thinking there's no reason for the INSERT per se to be touching nonexistent blocks, but if for some reason the pkey index still had entries pointing at vanished rows (as it seems to) then the errors could be coming from uniqueness checks attempting to fetch those rows to see if they're live.

regards, tom lane
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Maxim Boguk
Date:
On Tue, Feb 21, 2012 at 1:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Maxim Boguk <maxim.boguk@gmail.com> writes:
> >> Do you know why the mod date on the file is 2012-02-20 12:04?
>
> > Cron attempted to populate the table once per hour after the problem
> > happened.
> > And each time it produced the same error.
>
> That's interesting ... is there any possibility that the insertions were
> attempting to insert values that matched a previously-existing primary
> key value? I'm thinking there's no reason for the INSERT per se to be
> touching nonexistent blocks, but if for some reason the pkey index still
> had entries pointing at vanished rows (as it seems to) then the errors
> could be coming from uniqueness checks attempting to fetch those rows to
> see if they're live.
>
> regards, tom lane
Hi,
It isn't just a possibility: close to 100% of the newly inserted values matched a previously existing primary key value.
The table is a hand-made 'materialized view'-type statistics table which gets recalculated via cron.
--
Maxim Boguk
Senior Postgresql DBA.
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Rob Sargentg
Date:
On 02/20/2012 07:46 PM, Maxim Boguk wrote:
> On Tue, Feb 21, 2012 at 1:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Maxim Boguk <maxim.boguk@gmail.com> writes:
> > >> Do you know why the mod date on the file is 2012-02-20 12:04?
> > > Cron attempted to populate the table once per hour after the problem
> > > happened.
> > > And each time it produced the same error.
> >
> > That's interesting ... is there any possibility that the insertions were
> > attempting to insert values that matched a previously-existing primary
> > key value? I'm thinking there's no reason for the INSERT per se to be
> > touching nonexistent blocks, but if for some reason the pkey index still
> > had entries pointing at vanished rows (as it seems to) then the errors
> > could be coming from uniqueness checks attempting to fetch those rows to
> > see if they're live.
> >
> > regards, tom lane
>
> Hi,
>
> It isn't just a possibility: close to 100% of the newly inserted values matched a previously
> existing primary key value.
> The table is a hand-made 'materialized view'-type statistics table which gets recalculated via cron.
>
> --
> Maxim Boguk
> Senior Postgresql DBA.

Sorry Maxim, trying to follow along here: are you saying the inserted records DO or DO NOT match previously existing key values?
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Maxim Boguk
Date:
On Tue, Feb 21, 2012 at 1:46 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:
> On Tue, Feb 21, 2012 at 1:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Maxim Boguk <maxim.boguk@gmail.com> writes:
> > >> Do you know why the mod date on the file is 2012-02-20 12:04?
> > > Cron attempted to populate the table once per hour after the problem
> > > happened.
> > > And each time it produced the same error.
> >
> > That's interesting ... is there any possibility that the insertions were
> > attempting to insert values that matched a previously-existing primary
> > key value? I'm thinking there's no reason for the INSERT per se to be
> > touching nonexistent blocks, but if for some reason the pkey index still
> > had entries pointing at vanished rows (as it seems to) then the errors
> > could be coming from uniqueness checks attempting to fetch those rows to
> > see if they're live.
> >
> > regards, tom lane
>
> Hi,
>
> It isn't just a possibility: close to 100% of the newly inserted values matched a previously
> existing primary key value.
> The table is a hand-made 'materialized view'-type statistics table which gets recalculated via cron.

To be clear - the newly inserted values almost always DO match previously existing primary key values.
Sorry for not being clear.
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Tom Lane
Date:
Maxim Boguk <maxim.boguk@gmail.com> writes:
> On Tue, Feb 21, 2012 at 1:46 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:
>> It isn't just a possibility: close to 100% of the newly inserted values matched
>> a previously existing primary key value.
>> The table is a hand-made 'materialized view'-type statistics table which
>> gets recalculated via cron.

> To be clear - the newly inserted values almost always DO match previously existing primary key values.

OK, so that pretty much explains where the visible symptoms are coming from: somehow, the table got truncated but its pkey index did not get cleared out. So an insert creates an empty page zero, inserts a heap tuple there, tries to insert an index entry. The btree code sees there is an index entry for that key already, and tries to fetch the heap tuple for that index entry to see if it's dead (which would allow the insertion to proceed). But the block number the index is pointing at isn't there, so you get the quoted error message. The insertion rolls back, leaving a dead tuple that can be garbage-collected by autovacuum, after which it truncates the table again --- but of course without removing any index entries, except maybe one for TID (0,1) if that's still there. Lather rinse repeat.

But this brings us little closer to understanding the cause of the problem. How could the table have gotten truncated without vacuuming out the index? Are you sure the only operations happening on that table are INSERT, DELETE, autovacuum? No DDL of any kind?

regards, tom lane
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Tom Lane
Date:
I wrote:
> OK, so that pretty much explains where the visible symptoms are coming
> from: somehow, the table got truncated but its pkey index did not get
> cleared out. So an insert creates an empty page zero, inserts a heap
> tuple there, tries to insert an index entry. The btree code sees there
> is an index entry for that key already, and tries to fetch the heap
> tuple for that index entry to see if it's dead (which would allow the
> insertion to proceed). But the block number the index is pointing at
> isn't there, so you get the quoted error message. The insertion rolls
> back, leaving a dead tuple that can be garbage-collected by autovacuum,
> after which it truncates the table again --- but of course without
> removing any index entries, except maybe one for TID (0,1) if that's
> still there. Lather rinse repeat.

Hmm ... actually there is a point that this theory doesn't explain entirely. If the probability of a collision with an existing index entry was near 100%, then each hourly cron job should only have been able to insert one or a few heap tuples before failing. That would not trigger an autovacuum right away. Eventually the number of dead tuples would build up to the point where autovacuum got interested, but it strains credulity a bit to assume that this happened exactly after the last hourly run before you renamed the table. Yet, if that didn't happen just that way, how come the size of the table is exactly zero now?

The theory would be more satisfactory if we could expect that an hourly run would be able to insert some thousands of tuples before failing, enough to trigger an autovacuum run. So I'm wondering if maybe the index is *partially* cleaned out, but not completely. Does this materialized view have a fairly predictable number of rows, and if so how does that compare to the number of entries in the index? (If you have no other way to get the number of entries in the index, try inserting a dummy row, deleting it, and then VACUUM VERBOSE.)

regards, tom lane
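P.S. Concretely, the dummy-row check could look roughly like this (a sketch only; the key values are invented and just need to satisfy the NOT NULL columns of your table):

INSERT INTO agency_statistics_old
    (employer_id, area_id, area_path, professional_area_id, vacancies_number)
VALUES (-1, -1, 'dummy', -1, 0);
DELETE FROM agency_statistics_old WHERE employer_id = -1;
VACUUM VERBOSE agency_statistics_old;
-- the INFO line 'index "..." now contains N row versions' then reports the index entry count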
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Maxim Boguk
Date:
> OK, so that pretty much explains where the visible symptoms are coming
> from: somehow, the table got truncated but its pkey index did not get
> cleared out. So an insert creates an empty page zero, inserts a heap
> tuple there, tries to insert an index entry. The btree code sees there
> is an index entry for that key already, and tries to fetch the heap
> tuple for that index entry to see if it's dead (which would allow the
> insertion to proceed). But the block number the index is pointing at
> isn't there, so you get the quoted error message. The insertion rolls
> back, leaving a dead tuple that can be garbage-collected by autovacuum,
> after which it truncates the table again --- but of course without
> removing any index entries, except maybe one for TID (0,1) if that's
> still there. Lather rinse repeat.
>
> But this brings us little closer to understanding the cause of the
> problem. How could the table have gotten truncated without vacuuming
> out the index?
>
> Are you sure the only operations happening on that table are INSERT,
> DELETE, autovacuum? No DDL of any kind?
>
> regards, tom lane
Not that I know of. There could be some updates as well, but they are performed by the same cron as the initial delete, and they didn't happen in this case because the cron script exits on the first database error. I.e., to my knowledge, no one could write into that table except that single cron and autovacuum.
And I have logging of all queries with duration over 1 ms.
Kind Regards,
Maksym
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Maxim Boguk
Date:
On Tue, Feb 21, 2012 at 3:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> > OK, so that pretty much explains where the visible symptoms are coming
> > from: somehow, the table got truncated but its pkey index did not get
> > cleared out. So an insert creates an empty page zero, inserts a heap
> > tuple there, tries to insert an index entry. The btree code sees there
> > is an index entry for that key already, and tries to fetch the heap
> > tuple for that index entry to see if it's dead (which would allow the
> > insertion to proceed). But the block number the index is pointing at
> > isn't there, so you get the quoted error message. The insertion rolls
> > back, leaving a dead tuple that can be garbage-collected by autovacuum,
> > after which it truncates the table again --- but of course without
> > removing any index entries, except maybe one for TID (0,1) if that's
> > still there. Lather rinse repeat.
>
> Hmm ... actually there is a point that this theory doesn't explain
> entirely. If the probability of a collision with an existing index
> entry was near 100%, then each hourly cron job should only have been
> able to insert one or a few heap tuples before failing. That would
> not trigger an autovacuum right away. Eventually the number of dead
> tuples would build up to the point where autovacuum got interested,
> but it strains credulity a bit to assume that this happened exactly
> after the last hourly run before you renamed the table. Yet, if that
> didn't happen just that way, how come the size of the table is exactly
> zero now?
>
> The theory would be more satisfactory if we could expect that an hourly
> run would be able to insert some thousands of tuples before failing,
> enough to trigger an autovacuum run. So I'm wondering if maybe the
> index is *partially* cleaned out, but not completely. Does this
> materialized view have a fairly predictable number of rows, and if so
> how does that compare to the number of entries in the index? (If you
> have no other way to get the number of entries in the index, try
> inserting a dummy row, deleting it, and then VACUUM VERBOSE.)
>
> regards, tom lane
There are some funny results:
hh=# VACUUM verbose agency_statistics_old;
INFO: vacuuming "public.agency_statistics_old"
INFO: index "agency_statistics_pkey" now contains 0 row versions in 605 pages
DETAIL: 0 index row versions were removed.
595 index pages have been deleted, 595 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO: "agency_statistics_old": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
VACUUM
However, when I try to populate that table with production data I get an error:
hh=# insert into agency_statistics_old select * from agency_statistics;
ERROR: could not read block 228 in file "base/16404/118881486": read only 0 of 8192 bytes
I.e. the database sees that the index has zero rows, but an insert still fails.
Maybe I should use the pageinspect addon to see the actual index pages' content?
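For reference, this is roughly what I have in mind (a sketch, assuming the pageinspect contrib module is already installed; the block number is arbitrary, block 0 being the metapage):

hh=# SELECT * FROM bt_metap('agency_statistics_pkey');
hh=# SELECT * FROM bt_page_stats('agency_statistics_pkey', 1);
hh=# SELECT * FROM bt_page_items('agency_statistics_pkey', 1);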
--
Maxim Boguk
Senior Postgresql DBA.
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Maxim Boguk
Date:
On Tue, Feb 21, 2012 at 3:47 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:
> On Tue, Feb 21, 2012 at 3:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I wrote:
> > > OK, so that pretty much explains where the visible symptoms are coming
> > > from: somehow, the table got truncated but its pkey index did not get
> > > cleared out. So an insert creates an empty page zero, inserts a heap
> > > tuple there, tries to insert an index entry. The btree code sees there
> > > is an index entry for that key already, and tries to fetch the heap
> > > tuple for that index entry to see if it's dead (which would allow the
> > > insertion to proceed). But the block number the index is pointing at
> > > isn't there, so you get the quoted error message. The insertion rolls
> > > back, leaving a dead tuple that can be garbage-collected by autovacuum,
> > > after which it truncates the table again --- but of course without
> > > removing any index entries, except maybe one for TID (0,1) if that's
> > > still there. Lather rinse repeat.
> >
> > Hmm ... actually there is a point that this theory doesn't explain
> > entirely. If the probability of a collision with an existing index
> > entry was near 100%, then each hourly cron job should only have been
> > able to insert one or a few heap tuples before failing. That would
> > not trigger an autovacuum right away. Eventually the number of dead
> > tuples would build up to the point where autovacuum got interested,
> > but it strains credulity a bit to assume that this happened exactly
> > after the last hourly run before you renamed the table. Yet, if that
> > didn't happen just that way, how come the size of the table is exactly
> > zero now?
> >
> > The theory would be more satisfactory if we could expect that an hourly
> > run would be able to insert some thousands of tuples before failing,
> > enough to trigger an autovacuum run. So I'm wondering if maybe the
> > index is *partially* cleaned out, but not completely. Does this
> > materialized view have a fairly predictable number of rows, and if so
> > how does that compare to the number of entries in the index? (If you
> > have no other way to get the number of entries in the index, try
> > inserting a dummy row, deleting it, and then VACUUM VERBOSE.)
> >
> > regards, tom lane
>
> There are some funny results:
>
> hh=# VACUUM verbose agency_statistics_old;
> INFO: vacuuming "public.agency_statistics_old"
> INFO: index "agency_statistics_pkey" now contains 0 row versions in 605 pages
> DETAIL: 0 index row versions were removed.
> 595 index pages have been deleted, 595 are currently reusable.
> CPU 0.00s/0.00u sec elapsed 0.01 sec.
> INFO: "agency_statistics_old": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
> DETAIL: 0 dead row versions cannot be removed yet.
> There were 0 unused item pointers.
> 0 pages are entirely empty.
> CPU 0.00s/0.00u sec elapsed 0.01 sec.
> VACUUM
>
> However, when I try to populate that table with production data I get an error:
>
> hh=# insert into agency_statistics_old select * from agency_statistics;
> ERROR: could not read block 228 in file "base/16404/118881486": read only 0 of 8192 bytes
>
> I.e. the database sees that the index has zero rows, but an insert still fails.
> Maybe I should use the pageinspect addon to see the actual index pages' content?
What makes your idea that "the index is *partially* cleaned out, but not completely" highly probable is that the next query produces no error:
hh=# insert into agency_statistics_old select * from agency_statistics limit 1;
INSERT 0 1
--
Maxim Boguk
Senior Postgresql DBA.
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Tom Lane
Date:
Maxim Boguk <maxim.boguk@gmail.com> writes:
> There are some funny results:
> hh=# VACUUM verbose agency_statistics_old;
> INFO: vacuuming "public.agency_statistics_old"
> INFO: index "agency_statistics_pkey" now contains 0 row versions in 605 pages
> DETAIL: 0 index row versions were removed.

Wow. That seems to blow my theory to small pieces. If the index contains no entries then it shouldn't be causing any uniqueness check probes. But at the same time, if the index is empty then how come pgstatindex showed avg_leaf_density = 0.45 ?

> Maybe I should use the pageinspect addon to see the actual index pages' content?

That or pg_filedump would be interesting. But your experiments with adding data from the other table will probably have produced some new index entries, which will confuse the situation. Did you save a physical copy of the index before that?

Another idea is to attach to the backend with gdb, set a breakpoint at errfinish, and get a stack trace from the point of the "could not read block" error. That would show definitively if this is coming from a uniqueness check or something else entirely.

regards, tom lane
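P.S. For what it's worth, the gdb session would look roughly like this (a sketch; <pid> stands for the backend process serving the session that runs the failing INSERT):

$ gdb -p <pid>
(gdb) break errfinish
(gdb) continue
   ... run the failing INSERT in that session; when the breakpoint fires ...
(gdb) bt
(gdb) detach
(gdb) quit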
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Maxim Boguk
Date:
On Tue, Feb 21, 2012 at 4:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Maxim Boguk <maxim.boguk@gmail.com> writes:
> > There are some funny results:
> > hh=# VACUUM verbose agency_statistics_old;
> > INFO: vacuuming "public.agency_statistics_old"
> > INFO: index "agency_statistics_pkey" now contains 0 row versions in 605 pages
> > DETAIL: 0 index row versions were removed.
>
> Wow. That seems to blow my theory to small pieces. If the index
> contains no entries then it shouldn't be causing any uniqueness check
> probes. But at the same time, if the index is empty then how come
> pgstatindex showed avg_leaf_density = 0.45 ?
>
> > Maybe I should use the pageinspect addon to see the actual index pages' content?
>
> That or pg_filedump would be interesting. But your experiments with
> adding data from the other table will probably have produced some new
> index entries, which will confuse the situation. Did you save a
> physical copy of the index before that?
>
> Another idea is to attach to the backend with gdb, set a breakpoint at
> errfinish, and get a stack trace from the point of the "could not read
> block" error. That would show definitively if this is coming from a
> uniqueness check or something else entirely.
>
> regards, tom lane

Yes, I saved a physical copy of the file before I started playing with it.

Unfortunately that is a production server with no gcc and gdb available, so no pg_filedump or gdb yet (but I am going to work on it).

While I was waiting for gdb/gcc on that server I built pg_filedump on the development server using the same postgresql version and created a pg_filedump of the index file.
It can be downloaded here: http://maximboguk.com/static/etc/agency_statistics_pkey.pg_filedump.gz

I don't have enough knowledge of the b-tree index structure to extract any useful information from that file.
Kind Regards,
Maksym
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Tom Lane
Date:
Maxim Boguk <maxim.boguk@gmail.com> writes:
> While I was waiting for gdb/gcc on that server I built pg_filedump on the
> development server using the same postgresql version and created a pg_filedump of
> the index file.
> It can be downloaded here:
> http://maximboguk.com/static/etc/agency_statistics_pkey.pg_filedump.gz

If the index key values are not private information, could we see that with "pg_filedump -i -f" not just bare?

regards, tom lane
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Maxim Boguk
Date:
On Tue, Feb 21, 2012 at 5:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Maxim Boguk <maxim.boguk@gmail.com> writes:
> > While I was waiting for gdb/gcc on that server I built pg_filedump on the
> > development server using the same postgresql version and created a pg_filedump of
> > the index file.
> > It can be downloaded here:
> > http://maximboguk.com/static/etc/agency_statistics_pkey.pg_filedump.gz
>
> If the index key values are not private information, could we see that
> with "pg_filedump -i -f" not just bare?
>
> regards, tom lane

Here it is:
http://maximboguk.com/static/etc/agency_statistics_pkey.pg_full_filedump.gz
--
Maxim Boguk
Senior Postgresql DBA.
Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
From
Tom Lane
Date:
Maxim Boguk <maxim.boguk@gmail.com> writes:
> On Tue, Feb 21, 2012 at 5:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If the index key values are not private information, could we see that
>> with "pg_filedump -i -f" not just bare?

> Here it is:
> http://maximboguk.com/static/etc/agency_statistics_pkey.pg_full_filedump.gz

Thanks. The index definitely contains about a hundred apparently-live index entries, which shouldn't be there considering the table is empty, and it doesn't seem to be corrupted in any obvious way.

However, I noticed that all the pages have btpo_cycleid zero, which seems a bit improbable given that there are some with BTP_SPLIT_END set, and the only way for that to become set is if there are adjacent index pages with different btpo_cycleid values. This led me to look for bugs that might reset btpo_cycleid incorrectly, and I believe I see one. _bt_delitems_delete has this code that was copied-and-pasted from its previous incarnation as _bt_delitems:

    /*
     * We can clear the vacuum cycle ID since this page has certainly been
     * processed by the current vacuum scan.
     */
    opaque = (BTPageOpaque) PageGetSpecialPointer(page);
    opaque->btpo_cycleid = 0;

Now that comment is a lie; this is called from _bt_findinsertloc not from vacuum. So the series of events I'm imagining is

1. Autovacuum starts.

2. A page gets split; the page in question is beyond where autovac has reached in its index scan, but a page from before that gets recycled to become the new right half. Both these pages get marked with the vacuum's cycleid.

3. Whatever's doing the insertions continues to insert into the same range of the index. (I notice that your INSERT query has a GROUP BY that uses the high columns of your pkey, which means it seems fairly likely that insertions occur in something close to index order. So this is not unlikely.) It hits a couple of duplicate index entries, finds they're dead, and hence marks them dead and sets the BTP_HAS_GARBAGE field on the new right-half page (we need to assume this happens because the split would've cleared that flag).

4. It keeps on inserting and eventually fills the new right-half page. At this point, since BTP_HAS_GARBAGE is set, it runs _bt_delitems_delete and incorrectly clears btpo_cycleid on that page.

5. Finally, the concurrent vacuum reaches the originally split page. It notes the btpo_cycleid matching its cycleid and correctly realizes it has to go back and process the right-half page. However, when it reaches that page, it finds btpo_cycleid already zero and concludes it doesn't need to scan that page.

Result: any entries that should have been removed, but were not already marked DEAD at the time _bt_delitems_delete ran, are not removed from the index.

Now obviously I can't prove that this is exactly what happened in your index. It would have to have happened several times during a single vacuum run, I think, because the remaining index entries are spread across half a dozen index pages. However, all those pages are relatively early in the index --- the first such page is block 2, and the last one block 49, out of a 605-page index. And we know that your usage pattern is such that a whole lot of inserts would be going on at the same time that vacuum had been removing a whole lot of old entries (and hence causing low-numbered pages to get freed up and returned to the free space map). So it seems to fit.

Anyway, clearing btpo_cycleid there is unquestionably wrong.

regards, tom lane