Thread: Strange problem when upgrading to 7.2 with pg_upgrade.
I've started playing around with 7.2 on one of my development machines.
I decided to try the pg_upgrade program, something I usually never do.

Anyway, I followed the steps in the pg_upgrade instructions (going from
7.1.3 to 7.2), and then when I started the database up after the upgrade
finished and vacuumed one of my tables, I get these error messages from
the postmaster.  After this point I cannot restart the postmaster
without resetting the xlog.

I've kept the PGDATA directory around in case someone thinks this is
worth looking into; I would be more than happy to help out.

If I migrate the data over manually like I always do (pg_dump then
pg_restore), I don't have any problems.  Part of the problem might be
path names for shared libraries specified in CREATE FUNCTION; I started
using pg back when it was version 6, before '$libdir' was supported, and
I haven't bothered to take the absolute path names out yet -- I've just
updated it with each release (each release is installed in a different
location in case I need to roll back, and so I can test multiple
versions at one time).  Not sure if pg_upgrade even checks for this.

oby/pgsql@loopy pg_upgrade]$ /moby/pgsql-7.2/bin/postmaster -i -o -F -B 256 -D/mo
DEBUG:  database system was shut down at 2002-02-14 12:20:53 MST
DEBUG:  checkpoint record is at 1/A7000010
DEBUG:  redo record is at 1/A7000010; undo record is at 1/A7000010; shutdown TRUE
DEBUG:  next transaction id: 589031; next oid: 19512
DEBUG:  database system is ready

DEBUG:  --Relation developer--
DEBUG:  Pages 669: Changed 0, Empty 0; Tup 51508: Vac 0, Keep 0, UnUsed 0.
        Total CPU 0.07s/0.03u sec elapsed 0.11 sec.
DEBUG:  Analyzing developer
FATAL 2:  read of clog file 0, offset 139264 failed: Success
DEBUG:  server process (pid 17786) exited with exit code 2
DEBUG:  terminating any other active server processes
NOTICE:  Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend
        died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am
        going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.
DEBUG:  all server processes terminated; reinitializing shared memory and semaphores
DEBUG:  database system was interrupted at 2002-02-14 12:20:58 MST
DEBUG:  checkpoint record is at 1/A7000010
DEBUG:  redo record is at 1/A7000010; undo record is at 1/A7000010; shutdown TRUE
DEBUG:  next transaction id: 589031; next oid: 19512
DEBUG:  database system was not properly shut down; automatic recovery in progress
DEBUG:  redo starts at 1/A7000050
FATAL 2:  read of clog file 0, offset 139264 failed: Success
DEBUG:  startup process (pid 17788) exited with exit code 2
DEBUG:  aborting startup due to startup process failure
[postgres@loopy pg_upgrade]$ df -k
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hda8               248895    192496     43549  82% /
/dev/hda1                31079      4988     24487  17% /boot
/dev/hda5             24080660   6601476  17479184  28% /home
/dev/hda6              5044156   1930892   2857032  41% /usr
/dev/hda9               248895    133875    102170  57% /var
/dev/hdd1             59919196  39090008  20829188  66% /disk
oby/pgsql@loopy pg_upgrade]$ /moby/pgsql-7.2/bin/postmaster -i -o -F -B 256 -D/mo
DEBUG:  database system was interrupted being in recovery at 2002-02-14 12:21:06 MST
        This probably means that some data blocks are corrupted
        and you will have to use the last backup for recovery.
DEBUG:  checkpoint record is at 1/A7000010
DEBUG:  redo record is at 1/A7000010; undo record is at 1/A7000010; shutdown TRUE
DEBUG:  next transaction id: 589031; next oid: 19512
DEBUG:  database system was not properly shut down; automatic recovery in progress
DEBUG:  redo starts at 1/A7000050
FATAL 2:  read of clog file 0, offset 139264 failed: Success
DEBUG:  startup process (pid 17793) exited with exit code 2
DEBUG:  aborting startup due to startup process failure
[postgres@loopy pg_upgrade]$
Brian Hirt <bhirt@mobygames.com> writes:
> I decided to try the pg_upgrade program, something I usually never do.

> FATAL 2:  read of clog file 0, offset 139264 failed: Success

Could we see ls -l $PGDATA/pg_clog?

I suspect that pg_upgrade has neglected to make sure the clog is long
enough.

			regards, tom lane
[root@loopy pg_clog]# pwd
/moby/pgsql-upgrade-bad/pg_clog
[root@loopy pg_clog]# ls -la
total 9
drwx------    2 postgres postgres       72 Feb 14 09:32 .
drwx------    6 postgres postgres      304 Feb 14 16:02 ..
-rw-------    1 postgres postgres     8192 Feb 14 09:34 0000
[root@loopy pg_clog]# bzip2 < 0000 | uuencode -
begin 644 -
M0EIH.3%!62936<[:PW<``#Y_".;,1H``L!``9@!F``(`"```"#``V*#5/R*>
MHT--`TT!31H`T``"1"(TT*:#U/4>C]8_(/$);W"6=D0`'3$(Z9Y_D(V@K9T)
M+,\6"GDBTU?,C9R[NSB.6-X6M3\55RS<AS$:?0<,;N4/K>#.KV(E,[88LWG%
M[:QR6B"\'JK2G9LB*63"00449P7!2)#0O3IY4PT;P%DC'J$M$T3$5'RU5';2
A*:2EB*:1)!MI,SQ%1=GE_(FY2U#027L7<D4X4)#.VL-W
`
end
[root@loopy pg_clog]#

On Thu, 2002-02-14 at 17:54, Tom Lane wrote:
> Brian Hirt <bhirt@mobygames.com> writes:
> > I decided to try the pg_upgrade program, something I usually never do.
>
> > FATAL 2:  read of clog file 0, offset 139264 failed: Success
>
> Could we see ls -l $PGDATA/pg_clog?
>
> I suspect that pg_upgrade has neglected to make sure the clog is long
> enough.
>
> 			regards, tom lane
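The ls output bears out Tom's suspicion: pg_clog/0000 is only 8192 bytes, a single clog page, yet the backend is trying to read offset 139264. A back-of-the-envelope check shows where that offset comes from (a sketch; the 2-status-bits-per-transaction and 8K-page figures are from the 7.2 clog layout, not stated in this thread):

```sh
# Each clog byte holds 4 transaction statuses (2 bits each), so one
# 8192-byte page covers 32768 xids.  The cluster's next transaction id
# is 589031, which lands far past the one page that exists on disk.
NEXT_XID=589031
PAGE=`expr $NEXT_XID / 32768`       # clog page holding this xid
OFFSET=`expr $PAGE \* 8192`         # byte offset of that page in file 0000
echo "xid $NEXT_XID -> clog page $PAGE, file offset $OFFSET"
```

The computed offset matches the 139264 in the FATAL message exactly: the clog simply does not extend far enough to cover the carried-over transaction counter.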
Tom Lane wrote:
> Brian Hirt <bhirt@mobygames.com> writes:
> > I decided to try the pg_upgrade program, something I usually never do.
>
> > FATAL 2:  read of clog file 0, offset 139264 failed: Success
>
> Could we see ls -l $PGDATA/pg_clog?
>
> I suspect that pg_upgrade has neglected to make sure the clog is long
> enough.

Here is the code that sets the transaction id.  Tom, does pg_resetxlog
handle pg_clog file creation properly?

	# Set this so future backends don't think these tuples are their own
	# because it matches their own XID.
	# Commit status already updated by vacuum above

	# Set to maximum XID just in case SRC wrapped around recently and
	# is lower than DST's database
	if [ "$SRC_XID" -gt "$DST_XID" ]
	then	MAX_XID="$SRC_XID"
	else	MAX_XID="$DST_XID"
	fi

	pg_resetxlog -x "$MAX_XID" "$PGDATA"
	if [ "$?" -ne 0 ]
	then	echo "Unable to set new XID.  Exiting." 1>&2
		exit 1
	fi

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> I suspect that pg_upgrade has neglected to make sure the clog is long
>> enough.

> Here is the code that sets the transaction id.  Tom, does pg_resetxlog
> handle pg_clog file creation properly?

pg_resetxlog doesn't know a single solitary thing about the clog.

The problem here is that if you're going to move the current xact ID
forward, you need to be prepared to create pages of the clog
accordingly.  Or maybe the clog routines need to be less rigid in their
assumptions, but I'm uncomfortable with relaxing their expectations
unless it can be shown that they may fail to cope with cases that
arise in normal system operation.  This isn't such a case.

			regards, tom lane
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> I suspect that pg_upgrade has neglected to make sure the clog is long
> >> enough.
>
> > Here is the code that sets the transaction id.  Tom, does pg_resetxlog
> > handle pg_clog file creation properly?
>
> pg_resetxlog doesn't know a single solitary thing about the clog.
>
> The problem here is that if you're going to move the current xact ID
> forward, you need to be prepared to create pages of the clog
> accordingly.  Or maybe the clog routines need to be less rigid in their
> assumptions, but I'm uncomfortable with relaxing their expectations
> unless it can be shown that they may fail to cope with cases that
> arise in normal system operation.  This isn't such a case.

We increased the xid because the old files have xid's that are greater
than the newly initdb'ed database.  We did a vacuum, so no one is going
to check clog, but we need to increase the transaction counter because
old rows could be seen as matching the current transaction.

Can you suggest how to create the needed clog files?  I don't see any
value in changing your current clog code in the backend.
> We increased the xid because the old files have xid's that are greater
> than the newly initdb'ed database.  We did a vacuum, so no one is going
> to check clog, but we need to increase the transaction counter because
> old rows could be seen as matching the current transaction.
>
> Can you suggest how to create the needed clog files?  I don't see any
> value in changing your current clog code in the backend.

Tom, is there a way to increment the XID every 100 million and start the
postmaster to create the needed pg_clog files to get to the XID I need?
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> I suspect that pg_upgrade has neglected to make sure the clog is long
> >> enough.
>
> > Here is the code that sets the transaction id.  Tom, does pg_resetxlog
> > handle pg_clog file creation properly?
>
> pg_resetxlog doesn't know a single solitary thing about the clog.
>
> The problem here is that if you're going to move the current xact ID
> forward, you need to be prepared to create pages of the clog
> accordingly.  Or maybe the clog routines need to be less rigid in their
> assumptions, but I'm uncomfortable with relaxing their expectations
> unless it can be shown that they may fail to cope with cases that
> arise in normal system operation.  This isn't such a case.

Tom, any suggestion on how I can increase clog as part of pg_upgrade?
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom, any suggestion on how I can increase clog as part of pg_upgrade?

Append zeroes ...

			regards, tom lane
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom, any suggestion on how I can increase clog as part of pg_upgrade?
>
> Append zeroes ...

OK, I can 'dd' /dev/zero to append zeros to pad out the file.  How large
does the clog file get, 1gb?  Do I need to rename it at all?
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> OK, I can 'dd' /dev/zero to append zeros to pad out the file.  How large
> does the clog file get, 1gb?  Do I need to rename it at all?

256KB per segment.  Do *not* rename existing segments.

			regards, tom lane
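Concretely, the appending Bruce describes could look like this (an illustrative sketch, not the actual pg_upgrade code; the 8192-byte starting size is from Brian's ls output, and 262144 bytes is Tom's 256KB segment size):

```sh
cd "`mktemp -d`"                                     # scratch directory
dd if=/dev/zero bs=8192 count=1 of=0000 2>/dev/null  # stand-in for the 8K clog segment
dd if=/dev/zero bs=8192 count=31 >> 0000             # append 31 zero pages of zeroes
wc -c < 0000                                         # 262144 -- a full 256KB segment
```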
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > OK, I can 'dd' /dev/zero to append zeros to pad out the file.  How large
> > does the clog file get, 1gb?  Do I need to rename it at all?
>
> 256KB per segment.  Do *not* rename existing segments.

Right, no rename, but I will have to create additional files in 256kb
chunks, and I assume 1gb of chunks remains in the pg_clog directory?
Since I have done a vacuum, I assume I just keep creating 256k chunks
until I reach the max xid from the previous release, and delete the
files prior to the 1gb size limit.
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Since I have done a vacuum, I assume I just keep creating 256k chunks
> until I reach the max xid from the previous release, and delete the
> files prior to the 1gb size limit.

Keep your hands *off* the existing segments.  The CLOG code will clean
them up when it's good and ready ...

			regards, tom lane
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Since I have done a vacuum, I assume I just keep creating 256k chunks
> > until I reach the max xid from the previous release, and delete the
> > files prior to the 1gb size limit.
>
> Keep your hands *off* the existing segments.  The CLOG code will clean
> them up when it's good and ready ...

OK.  Fill out the current clog and add additional ones to reach the
current max xid, rounded to the nearest 8k, assuming a 256k file equals
1mb of xids.

Why do you take these things so personally?
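Putting Tom's constraints together (append zeroes, 256KB per segment, never rename or remove existing segments), the padding step might be sketched as below. The pad_clog helper is hypothetical, not actual pg_upgrade code, and the 1048576-xids-per-segment figure assumes the 7.2 clog format of 2 status bits per transaction (4 transactions per byte):

```sh
#!/bin/sh
# pad_clog DIR MAX_XID -- hypothetical sketch: extend clog coverage up
# to MAX_XID by appending zero pages.  One 256KB segment (262144 bytes
# at 4 xacts/byte) covers 1048576 xids; segment files are named with
# the 4-hex-digit segment number (0000, 0001, ...).  Existing segments
# are only appended to, never renamed or deleted.
pad_clog() {
	clogdir="$1"
	max_xid="$2"
	last_seg=`expr $max_xid / 1048576`
	seg=0
	while [ "$seg" -le "$last_seg" ]
	do
		file="$clogdir"/`printf '%04X' $seg`
		cur=0
		[ -f "$file" ] && cur=`wc -c < "$file"`
		missing=`expr 262144 - $cur`
		if [ "$missing" -gt 0 ]
		then
			# Zeroed status bits are safe here because the old
			# cluster was vacuumed before the upgrade.
			dd if=/dev/zero bs=8192 count=`expr $missing / 8192` \
				>> "$file" 2>/dev/null
		fi
		seg=`expr $seg + 1`
	done
}
```

For instance, pad_clog "$PGDATA"/pg_clog 589031 would grow the 8192-byte 0000 segment to the full 262144 bytes, covering Brian's next transaction id of 589031.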
This is a good bug report.  I can fix pg_upgrade by adding clog files
containing zeros to pad out to the proper length.  However, my guess is
that most people have already upgraded to 7.2.X, so there isn't much
value in fixing it now.  I have updated pg_upgrade in CVS for 7.3, and
hopefully we will have it working and well tested by the time 7.3 is
released.

Compressed clog was new in 7.2, so I guess it is no surprise I missed
that change in pg_upgrade.  In 7.3, pg_clog will be moved over from the
old install, so this shouldn't be a problem with 7.3.

Thanks for the report.  Sorry I don't have a fix.

---------------------------------------------------------------------------

Brian Hirt wrote:
> I've started playing around with 7.2 on one of my development machines.
> I decided to try the pg_upgrade program, something I usually never do.
>
> Anyway, I followed the steps in the pg_upgrade (going from 7.1.3 to
> 7.2), and then when I started the database up after the upgrade finished
> and vacuumed one of my tables, i get these error messages from the
> postmaster.  After this point I cannot restart the postmaster without
> resetting the xlog.
>
> I've kept the PGDATA directory around incase someone thinks this is
> worth looking into, i would be more than happy to help out.
>
> If i migrate the data over manually like a always do (pg_dump then
> pg_restore), i don't have any problems.  Part of the problem might be
> path names for shared libraries specified in CREATE FUNCTION; I started
> using pg back when it was version 6 before '$libdir' was supported and I
> haven't bothered to take the absolute path names out yet -- i've just
> updated it with each release (each release is installed in a different
> location in case i need to roll back, and so i can test multiple version
> at one time).  not sure if pg_upgrade even checks for this.
I wouldn't be so quick to assume that almost everyone has upgraded by
now.  I know we have not, at least not in production.

On Tuesday 09 April 2002 02:14 pm, Bruce Momjian wrote:
> This is a good bug report.  I can fix pg_upgrade by adding clog files
> containing zeros to pad out to the proper length.  However, my guess is
> that most people have already upgraded to 7.2.X, so there isn't much
> value in fixing it now.  I have updated pg_upgrade in CVS for 7.3, and
> hopefully we will have it working and well tested by the time 7.3 is
> released.
>
> Compressed clog was new in 7.2, so I guess it is no surprise I missed
> that change in pg_upgrade.  In 7.3, pg_clog will be moved over from the
> old install, so this shouldn't be a problem with 7.3.
>
> Thanks for the report.  Sorry I don't have a fix.
* Matthew T. O'Connor (matthew@zeut.net) [020409 15:34]:
> I wouldn't be so quick to assume that almost everyone has upgraded by
> now.  I know we have not, at least not in production.

Yeah, what he said.  Test, QA and development yes; production, no.

-Brad
Bradley McLean wrote:
> * Matthew T. O'Connor (matthew@zeut.net) [020409 15:34]:
> > I wouldn't be so quick to assume that almost everyone has upgraded by
> > now.  I know we have not, at least not in production.
>
> Yeah, what he said.  Test, QA and development yes; production, no.

The question is whether anyone who has delayed installing 7.2 will be
using pg_upgrade.  Odds are they will not, and clearly we can't get
enough testing of pg_upgrade to be sure it will work well with 7.2.X.