Thread: BUG #7819: missing chunk number 0 for toast value 1235919 in pg_toast_35328

BUG #7819: missing chunk number 0 for toast value 1235919 in pg_toast_35328

From: pchan@contigo.com
The following bug has been logged on the website:

Bug reference:      7819
Logged by:          Pius Chan
Email address:      pchan@contigo.com
PostgreSQL version: 9.1.7
Operating system:   CentOS (Linux version 2.6.32-220)
Description:


We just upgraded from PostgreSQL 9.1.3 to 9.1.7 last week. In our database
cluster, one database is used by the Java Message Service (JMS) system. One
table in that database, "jms_messages", is inserted into, updated and then
deleted frequently. Every five minutes, a cron job runs "cluster
jms_messages; analyze jms_messages". About once per day we get the above
ERROR, and I have checked that pg_toast_35328 is the TOAST table of the
jms_messages table.
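
As a rough sketch, the maintenance job amounts to the following SQL, run
from cron every five minutes (the cron wrapper itself is not shown here):

    CLUSTER jms_messages;
    ANALYZE jms_messages;
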
pchan@contigo.com writes:
> We just upgraded from PostgreSQL 9.1.3 to 9.1.7 last week.

Are you saying this error started to appear just after the upgrade?  Or
was it there before?

> In our database
> cluster, one database is used by the Java Message Service (JMS) system. One
> table in that database, "jms_messages", is inserted into, updated and then
> deleted frequently. Every five minutes, a cron job runs "cluster
> jms_messages; analyze jms_messages". About once per day we get the above
> ERROR, and I have checked that pg_toast_35328 is the TOAST table of the
> jms_messages table.

Is that error coming from the CLUSTER operations, or other commands?  Is
it persisting across (successful) CLUSTERs?  Is it always the same toast
value OID that's being complained of?

            regards, tom lane
Yes, I did not see this ERROR before the upgrade.

Thanks,

Pius
Hi Tom,

Yes, we started seeing this ERROR after the upgrade to 9.1.7. The ERROR is
from the "cluster jms_messages" command.  Last night, the ERROR happened
three times:

(1) at 00:20:01
 ERROR:  missing chunk number 0 for toast value 1239124 in pg_toast_35328

(2) at 00:25:01
ERROR:  missing chunk number 0 for toast value 1239124 in pg_toast_35328

(3) at 00:35:01
 ERROR:  missing chunk number 0 for toast value 1241022 in pg_toast_35328

The "cluster jms_messages" runs every 5 minutes. However, so far, it seems =
that the ERROR happens at about mid-night and 35328 is the toast area of th=
e "jms_message" table:

-[ RECORD 1 ]-+-------------
oid           | 35328
relname       | jms_messages
reltoastrelid | 35331
reltoastidxid | 0
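
For reference, that record comes from a pg_class lookup along these lines,
shown in psql expanded display mode:

    SELECT oid, relname, reltoastrelid, reltoastidxid
    FROM pg_class
    WHERE relname = 'jms_messages';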

Thanks a lot,

Pius
Pius Chan <pchan@contigo.com> writes:
> Hi Tom,
> Yes, we started seeing this ERROR after the upgrade to 9.1.7. The ERROR is from the "cluster jms_messages" command.
> Last night, the ERROR happened three times:

> (1) at 00:20:01
>  ERROR:  missing chunk number 0 for toast value 1239124 in pg_toast_35328

> (2) at 00:25:01
> ERROR:  missing chunk number 0 for toast value 1239124 in pg_toast_35328

> (3) at 00:35:01
>  ERROR:  missing chunk number 0 for toast value 1241022 in pg_toast_35328

> The "cluster jms_messages" runs every 5 minutes. However, so far, it seems that the ERROR happens at about mid-night
and35328 is the toast area of the "jms_message" table: 

So what else is this application doing around midnight?  In particular,
it seems like something must've happened between 00:15 and 00:20 to
create the problem with OID 1239124, and then something else happened
between 00:25 and 00:30 to get rid of it.  And then a similar sequence
happened between 00:30 and 00:40 involving OID 1241022.  Most likely the
trigger events are application actions against jms_messages, but we have
no info about what.  (It's also possible that autovacuums of
jms_messages are involved, so you might want to crank up your logging
enough so you can see those in the postmaster log.)
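
For example, with log_autovacuum_min_duration = 0 in postgresql.conf, every
autovacuum run gets logged; you can also watch pg_stat_all_tables with
something roughly like this to see when autovacuum last touched the table
and its TOAST table:

    SELECT relname, last_autovacuum
    FROM pg_stat_all_tables
    WHERE relname IN ('jms_messages', 'pg_toast_35328');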

I had first thought that this might have something to do with the
toast-value-processing changes we made in CLUSTER a year or so ago,
but a look in the commit logs confirms that those patches were in 9.1.3.
So I have no idea what might've broken between 9.1.3 and 9.1.7.  We
need data, or even better an actual test case.

            regards, tom lane
Hi Tom,

We may actually have found the cause of the problem. Last night we found two
old rows in the jms_messages table which should have been gone a long time
ago. Once we deleted them, everything went back to normal, with no more
ERROR messages. We suspect these rows were corrupted.

Thanks a lot for your help.

Pius
Hi Tom,

The ERROR happened again. After several days of investigation and testing,
I can now reproduce the ERROR in my testing environment. The reason the
ERROR used to happen in a certain time period is that our report batch jobs
run in that period, and the batch jobs can make the TOAST area grow.  I can
repeat the ERROR with the following setup and testing procedure.

Set up
======
(1) check that the configuration parameter log_autovacuum_min_duration = 0
(2) in a 9.1.7 database cluster, create a database called "TEST_LOCK".
(3) in the database "TEST_LOCK", create one table called "lock" with one row
in it:
     create table lock (id bigint, constraint lock_pk primary key (id));
     insert into lock values (1);

(4) in the same database cluster, create another database called "TEST".
(5) in the database "TEST", create one table "test" with the following
columns:

create table test (
     id             bigint not null,
     txop           char(1),
     messageblob    bytea,
     constraint test_pk primary key (id)
);


Pre-testing set up:
===================
(1) use psql to connect to TEST_LOCK, issue the following SQL, and leave the
session open during the test:
      begin;
      select * from lock where id = 1 for update;

(2) use another psql to connect to TEST_LOCK and issue the same SQL as in
(1). This session will be blocked by (1).
(3) this simulates our production environment, where our middle-tier service
uses the TEST_LOCK.lock table for master/slave fail-over. We run the
middle-tier service on several servers. When a middle-tier service starts
up, it tries to lock that row; the server that successfully holds the lock
becomes the master.


Testing set-up:
===============
Part 1 - traffic that consumes the TOAST area of the table "test" in the
TEST database
(1) use a program to INSERT INTO test with a messageblob longer than 1MB and
txop = 'A'. This ensures the TOAST area of the test table is used. Then
update txop to 'D'. Then delete the same row. These operations each run in
their own transaction. (A rough SQL sketch of one iteration follows Part 2
below.)
(2) this simulates the batch report traffic we see from 00:15 - 00:45 every
night.


Part 2 - traffic that does not consume the TOAST area of the table "test"
(1) use a program to INSERT INTO test with a messageblob of about 3000 bytes
and txop = 'A'. Then update txop to 'D'. Then delete the same row. These
operations each run in their own transaction.
(2) this simulates the background traffic we see during the whole day.
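
As a rough sketch (my actual test program is not included here; the
md5/generate_series construction is just one way to build byte strings of
the right size and compressibility), one iteration of each workload looks
like this:

    -- Part 1: a value over 1MB of effectively incompressible bytes,
    -- which is therefore stored out of line in the TOAST area
    INSERT INTO test (id, txop, messageblob)
    SELECT 1, 'A', string_agg(decode(md5(g::text), 'hex'), ''::bytea)
    FROM generate_series(1, 70000) AS g;        -- roughly 1.1 MB
    UPDATE test SET txop = 'D' WHERE id = 1;
    DELETE FROM test WHERE id = 1;

    -- Part 2: a ~3000-byte, highly compressible value, which is compressed
    -- inline and does not land in the TOAST area
    INSERT INTO test (id, txop, messageblob)
    VALUES (2, 'A', convert_to(repeat('x', 3000), 'UTF8'));
    UPDATE test SET txop = 'D' WHERE id = 2;
    DELETE FROM test WHERE id = 2;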


For part 1 and part 2, the traffic must be high enough so that, from the
database log:
(1) autovacuum of the table TEST.public.test runs every minute
(2) autovacuum of the table's TOAST area runs every minute
(3) from the output of "analyze verbose test", the number of pages scanned
keeps increasing, even with autovacuum running on the table
(4) similarly, when querying the size of the TOAST area, it keeps
increasing, even with autovacuum running on the TOAST area every minute
(5) clustering the table test does not reduce the table size or the TOAST
area size either
(6) at any time, the number of live rows is only a few but the number of
dead rows is high (see the monitoring queries below)
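
The checks for (4) and (6) can be done with queries roughly like these
(table names as above):

    -- (4) size of the table and of its TOAST area
    SELECT pg_size_pretty(pg_relation_size('test')) AS table_size,
           pg_size_pretty(pg_relation_size(reltoastrelid)) AS toast_size
    FROM pg_class
    WHERE relname = 'test';

    -- (6) live vs. dead rows
    SELECT n_live_tup, n_dead_tup
    FROM pg_stat_user_tables
    WHERE relname = 'test';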




Testing procedures:
===================
(1) use psql to connect to TEST and analyze the table frequently so as to
monitor its size. Run SQL to check the size of the table's TOAST area as
well.
(2) tail the database log and check that the automatic vacuum of the table
and its TOAST area happens every minute
(3) wait until the size of the table is at least 400 pages and the TOAST
area is at least 20MB
(4) wait for the database log to emit "automatic vacuum of table
"TEST.pg_toast.pg_toast_xxxx": index scans: ..." (pg_toast_xxxx is the TOAST
area of the table test), then IMMEDIATELY issue the command to cluster the
table test in the psql session:

    cluster test using test_pk;

Testing results:
================
(1) If the timing is right, I occasionally get the ERROR (about one run out
of ten):

ERROR:  missing chunk number 0 for toast value 13113341 in pg_toast_13084520

(2) Once I get the ERROR, subsequent CLUSTER commands get the same ERROR
until the next autovacuum of the TOAST area is done.

(3) I tested on both PostgreSQL 9.1.7 and 9.1.3 and can get the ERROR on
both, so this is not a 9.1.7-specific bug.


Thanks a lot for your help.

Pius
Pius Chan <pchan@contigo.com> writes:
> The ERROR happened again. After several days of investigation and testing, I can now reproduce the ERROR in my
> testing environment. The reason the ERROR used to happen in a certain time period is that our report batch jobs run
> in that period, and the batch jobs can make the TOAST area grow.  I can repeat the ERROR with the following setup
> and testing procedure.

Thanks.  It would've been easier if you'd provided a more concrete test
procedure, but I've been able to reproduce what seems to be the same
failure.  I don't know exactly what to do about it yet :-( --- see
http://www.postgresql.org/message-id/20362.1359747327@sss.pgh.pa.us

At the moment it appears to me that this error could only occur in
CLUSTER or its close cousin VACUUM FULL; ordinary database queries could
not see such a failure.  Does that agree with your experience?  If so,
this isn't really a data loss bug, so you should be able to just live
with it until we can work out a fix.

            regards, tom lane
Tom,

Thanks for your prompt response. Yeah, I should have provided you with my
testing scripts. BTW, during numerous tests, I noticed that if there is no
long-held open transaction (the one used for middle-tier service
master/slave failover), the database server is much quicker to recover the
space left by dead rows, and it is also hard to make the TOAST area grow.
Therefore, it is hard for me to reproduce the ERROR without a long-held open
transaction. Do you have any insight into this?

Regards,

Pius
Pius Chan <pchan@contigo.com> writes:
> Thanks for your prompt response. Yeah, I should have provided you with my testing scripts. BTW, during numerous
> tests, I noticed that if there is no long-held open transaction (the one used for middle-tier service master/slave
> failover), the database server is much quicker to recover the space left by dead rows, and it is also hard to make
> the TOAST area grow. Therefore, it is hard for me to reproduce the ERROR without a long-held open transaction. Do
> you have any insight into this?

I think the proximate cause is probably this case mentioned in
GetOldestXmin's comments:

 * if allDbs is FALSE and there are no transactions running in the current
 * database, GetOldestXmin() returns latestCompletedXid. If a transaction
 * begins after that, its xmin will include in-progress transactions in other
 * databases that started earlier, so another call will return a lower value.

So the trouble case is where autovacuum on the toast table starts at an
instant where nothing's running in the "test" database, but there are
pre-existing transaction(s) in the other database.  Then later CLUSTER
starts at an instant where transactions are running in "test" and their
xmins include the pre-existing transactions.

So you need long-running transactions in another DB than the one where
the vacuuming/clustering action is happening, as well as some unlucky
timing.  Assuming my theory is the correct one, of course.
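
If you want to check for that pattern, something along these lines (9.1
column names) will show sessions in any database that have had a
transaction open for a while:

    SELECT datname, procpid, xact_start, current_query
    FROM pg_stat_activity
    WHERE xact_start < now() - interval '5 minutes'
    ORDER BY xact_start;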

            regards, tom lane