Thread: Strange problem when upgrading to 7.2 with pg_upgrade.
I've started playing around with 7.2 on one of my development machines.
I decided to try the pg_upgrade program, something I usually never do.

Anyway, I followed the steps in the pg_upgrade instructions (going from
7.1.3 to 7.2), and then when I started the database up after the upgrade
finished and vacuumed one of my tables, I get these error messages from
the postmaster.  After this point I cannot restart the postmaster
without resetting the xlog.

I've kept the PGDATA directory around in case someone thinks this is
worth looking into; I would be more than happy to help out.

If I migrate the data over manually like I always do (pg_dump then
pg_restore), I don't have any problems.  Part of the problem might be
path names for shared libraries specified in CREATE FUNCTION; I started
using pg back when it was version 6, before '$libdir' was supported, and
I haven't bothered to take the absolute path names out yet -- I've just
updated it with each release (each release is installed in a different
location in case I need to roll back, and so I can test multiple
versions at one time).  Not sure if pg_upgrade even checks for this.

oby/pgsql@loopy pg_upgrade]$ /moby/pgsql-7.2/bin/postmaster -i -o -F -B 256 -D/mo
DEBUG:  database system was shut down at 2002-02-14 12:20:53 MST
DEBUG:  checkpoint record is at 1/A7000010
DEBUG:  redo record is at 1/A7000010; undo record is at 1/A7000010; shutdown TRUE
DEBUG:  next transaction id: 589031; next oid: 19512
DEBUG:  database system is ready

DEBUG:  --Relation developer--
DEBUG:  Pages 669: Changed 0, Empty 0; Tup 51508: Vac 0, Keep 0, UnUsed 0.
        Total CPU 0.07s/0.03u sec elapsed 0.11 sec.
DEBUG:  Analyzing developer
FATAL 2:  read of clog file 0, offset 139264 failed: Success
DEBUG:  server process (pid 17786) exited with exit code 2
DEBUG:  terminating any other active server processes
NOTICE:  Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend
        died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am
        going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.
DEBUG:  all server processes terminated; reinitializing shared memory and semaphores
DEBUG:  database system was interrupted at 2002-02-14 12:20:58 MST
DEBUG:  checkpoint record is at 1/A7000010
DEBUG:  redo record is at 1/A7000010; undo record is at 1/A7000010; shutdown TRUE
DEBUG:  next transaction id: 589031; next oid: 19512
DEBUG:  database system was not properly shut down; automatic recovery in progress
DEBUG:  redo starts at 1/A7000050
FATAL 2:  read of clog file 0, offset 139264 failed: Success
DEBUG:  startup process (pid 17788) exited with exit code 2
DEBUG:  aborting startup due to startup process failure
[postgres@loopy pg_upgrade]$ df -k
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hda8               248895    192496     43549  82% /
/dev/hda1                31079      4988     24487  17% /boot
/dev/hda5             24080660   6601476  17479184  28% /home
/dev/hda6              5044156   1930892   2857032  41% /usr
/dev/hda9               248895    133875    102170  57% /var
/dev/hdd1             59919196  39090008  20829188  66% /disk
oby/pgsql@loopy pg_upgrade]$ /moby/pgsql-7.2/bin/postmaster -i -o -F -B 256 -D/mo
DEBUG:  database system was interrupted being in recovery at 2002-02-14 12:21:06 MST
        This probably means that some data blocks are corrupted
        and you will have to use the last backup for recovery.
DEBUG:  checkpoint record is at 1/A7000010
DEBUG:  redo record is at 1/A7000010; undo record is at 1/A7000010; shutdown TRUE
DEBUG:  next transaction id: 589031; next oid: 19512
DEBUG:  database system was not properly shut down; automatic recovery in progress
DEBUG:  redo starts at 1/A7000050
FATAL 2:  read of clog file 0, offset 139264 failed: Success
DEBUG:  startup process (pid 17793) exited with exit code 2
DEBUG:  aborting startup due to startup process failure
[postgres@loopy pg_upgrade]$
Brian Hirt <bhirt@mobygames.com> writes:
> I decided to try the pg_upgrade program, something I usually never do.

> FATAL 2:  read of clog file 0, offset 139264 failed: Success

Could we see ls -l $PGDATA/pg_clog?

I suspect that pg_upgrade has neglected to make sure the clog is long
enough.

			regards, tom lane
[root@loopy pg_clog]# pwd
/moby/pgsql-upgrade-bad/pg_clog
[root@loopy pg_clog]# ls -la
total 9
drwx------    2 postgres postgres       72 Feb 14 09:32 .
drwx------    6 postgres postgres      304 Feb 14 16:02 ..
-rw-------    1 postgres postgres     8192 Feb 14 09:34 0000
[root@loopy pg_clog]# bzip2 < 0000 | uuencode -
begin 644 -
M0EIH.3%!62936<[:PW<``#Y_".;,1H``L!``9@!F``(`"```"#``V*#5/R*>
MHT--`TT!31H`T``"1"(TT*:#U/4>C]8_(/$);W"6=D0`'3$(Z9Y_D(V@K9T)
M+,\6"GDBTU?,C9R[NSB.6-X6M3\55RS<AS$:?0<,;N4/K>#.KV(E,[88LWG%
M[:QR6B"\'JK2G9LB*63"00449P7!2)#0O3IY4PT;P%DC'J$M$T3$5'RU5';2
A*:2EB*:1)!MI,SQ%1=GE_(FY2U#027L7<D4X4)#.VL-W
`
end
[root@loopy pg_clog]#

On Thu, 2002-02-14 at 17:54, Tom Lane wrote:
> Brian Hirt <bhirt@mobygames.com> writes:
> > I decided to try the pg_upgrade program, something I usually never do.
>
> > FATAL 2:  read of clog file 0, offset 139264 failed: Success
>
> Could we see ls -l $PGDATA/pg_clog?
>
> I suspect that pg_upgrade has neglected to make sure the clog is long
> enough.
>
> 			regards, tom lane
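The ls output bears out Tom's suspicion: pg_clog/0000 is only 8192 bytes, a single clog page, yet the backend is trying to read offset 139264. A back-of-the-envelope check shows where that offset comes from (a sketch; the 2-status-bits-per-transaction and 8K-page figures are from the 7.2 clog layout, not stated in this thread):

```sh
# Each clog byte holds 4 transaction statuses (2 bits each), so one
# 8192-byte page covers 32768 xids.  The cluster's next transaction id
# is 589031, which lands far past the one page that exists on disk.
NEXT_XID=589031
PAGE=`expr $NEXT_XID / 32768`       # clog page holding this xid
OFFSET=`expr $PAGE \* 8192`         # byte offset of that page in file 0000
echo "xid $NEXT_XID -> clog page $PAGE, file offset $OFFSET"
```

The computed offset matches the 139264 in the FATAL message exactly: the clog simply does not extend far enough to cover the carried-over transaction counter.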
Tom Lane wrote:
> Brian Hirt <bhirt@mobygames.com> writes:
> > I decided to try the pg_upgrade program, something I usually never do.
>
> > FATAL 2:  read of clog file 0, offset 139264 failed: Success
>
> Could we see ls -l $PGDATA/pg_clog?
>
> I suspect that pg_upgrade has neglected to make sure the clog is long
> enough.

Here is the code that sets the transaction id.  Tom, does pg_resetxlog
handle pg_clog file creation properly?

	# Set this so future backends don't think these tuples are their own
	# because it matches their own XID.
	# Commit status already updated by vacuum above

	# Set to maximum XID just in case SRC wrapped around recently and
	# is lower than DST's database
	if [ "$SRC_XID" -gt "$DST_XID" ]
	then	MAX_XID="$SRC_XID"
	else	MAX_XID="$DST_XID"
	fi

	pg_resetxlog -x "$MAX_XID" "$PGDATA"
	if [ "$?" -ne 0 ]
	then	echo "Unable to set new XID.  Exiting." 1>&2
		exit 1
	fi

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> I suspect that pg_upgrade has neglected to make sure the clog is long
>> enough.

> Here is the code that sets the transaction id.  Tom, does pg_resetxlog
> handle pg_clog file creation properly?

pg_resetxlog doesn't know a single solitary thing about the clog.

The problem here is that if you're going to move the current xact ID
forward, you need to be prepared to create pages of the clog
accordingly.  Or maybe the clog routines need to be less rigid in their
assumptions, but I'm uncomfortable with relaxing their expectations
unless it can be shown that they may fail to cope with cases that
arise in normal system operation.  This isn't such a case.

			regards, tom lane
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> I suspect that pg_upgrade has neglected to make sure the clog is long
> >> enough.
>
> > Here is the code that sets the transaction id.  Tom, does pg_resetxlog
> > handle pg_clog file creation properly?
>
> pg_resetxlog doesn't know a single solitary thing about the clog.
>
> The problem here is that if you're going to move the current xact ID
> forward, you need to be prepared to create pages of the clog
> accordingly.  Or maybe the clog routines need to be less rigid in their
> assumptions, but I'm uncomfortable with relaxing their expectations
> unless it can be shown that they may fail to cope with cases that
> arise in normal system operation.  This isn't such a case.

We increased the xid because the old files have xid's that are greater
than the newly initdb'ed database.  We did a vacuum, so no one is going
to check clog, but we need to increase the transaction counter because
old rows could be seen as matching the current transaction.

Can you suggest how to create the needed clog files?  I don't see any
value in changing your current clog code in the backend.
> We increased the xid because the old files have xid's that are greater
> than the newly initdb'ed database.  We did a vacuum, so no one is going
> to check clog, but we need to increase the transaction counter because
> old rows could be seen as matching the current transaction.
>
> Can you suggest how to create the needed clog files?  I don't see any
> value in changing your current clog code in the backend.

Tom, is there a way to increment the XID every 100 million and start the
postmaster to create the needed pg_clog files to get to the XID I need?
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> I suspect that pg_upgrade has neglected to make sure the clog is long
> >> enough.
>
> > Here is the code that sets the transaction id.  Tom, does pg_resetxlog
> > handle pg_clog file creation properly?
>
> pg_resetxlog doesn't know a single solitary thing about the clog.
>
> The problem here is that if you're going to move the current xact ID
> forward, you need to be prepared to create pages of the clog
> accordingly.  Or maybe the clog routines need to be less rigid in their
> assumptions, but I'm uncomfortable with relaxing their expectations
> unless it can be shown that they may fail to cope with cases that
> arise in normal system operation.  This isn't such a case.

Tom, any suggestion on how I can increase clog as part of pg_upgrade?
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom, any suggestion on how I can increase clog as part of pg_upgrade?

Append zeroes ...

			regards, tom lane
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom, any suggestion on how I can increase clog as part of pg_upgrade?
>
> Append zeroes ...

OK, I can 'dd' /dev/zero to append zeros to pad out the file.  How large
does the clog file get, 1gb?  Do I need to rename it at all?
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> OK, I can 'dd' /dev/zero to append zeros to pad out the file.  How large
> does the clog file get, 1gb?  Do I need to rename it at all?

256KB per segment.  Do *not* rename existing segments.

			regards, tom lane
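Concretely, the appending Bruce describes could look like this (an illustrative sketch, not the actual pg_upgrade code; the 8192-byte starting size is from Brian's ls output, and 262144 bytes is Tom's 256KB segment size):

```sh
cd "`mktemp -d`"                                     # scratch directory
dd if=/dev/zero bs=8192 count=1 of=0000 2>/dev/null  # stand-in for the 8K clog segment
dd if=/dev/zero bs=8192 count=31 >> 0000             # append 31 zero pages of zeroes
wc -c < 0000                                         # 262144 -- a full 256KB segment
```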
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > OK, I can 'dd' /dev/zero to append zeros to pad out the file.  How large
> > does the clog file get, 1gb?  Do I need to rename it at all?
>
> 256KB per segment.  Do *not* rename existing segments.

Right, no rename, but I will have to create additional files in 256kb
chunks, and I assume 1gb of chunks remains in the pg_clog directory?
Since I have done a vacuum, I assume I just keep creating 256k chunks
until I reach the max xid from the previous release, and delete the
files prior to the 1gb size limit.
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Since I have done a vacuum, I assume I just keep creating 256k chunks
> until I reach the max xid from the previous release, and delete the
> files prior to the 1gb size limit.

Keep your hands *off* the existing segments.  The CLOG code will clean
them up when it's good and ready ...

			regards, tom lane
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Since I have done a vacuum, I assume I just keep creating 256k chunks
> > until I reach the max xid from the previous release, and delete the
> > files prior to the 1gb size limit.
>
> Keep your hands *off* the existing segments.  The CLOG code will clean
> them up when it's good and ready ...

OK.  Fill out the current clog and add additional ones to reach the
current max xid, rounded to the nearest 8k, assuming a 256k file equals
1mb of xids.

Why do you take these things so personally?
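Putting Tom's constraints together (append zeroes, 256KB per segment, never rename or remove existing segments), the padding step might be sketched as below. The pad_clog helper is hypothetical, not actual pg_upgrade code, and the 1048576-xids-per-segment figure assumes the 7.2 clog format of 2 status bits per transaction (4 transactions per byte):

```sh
#!/bin/sh
# pad_clog DIR MAX_XID -- hypothetical sketch: extend clog coverage up
# to MAX_XID by appending zero pages.  One 256KB segment (262144 bytes
# at 4 xacts/byte) covers 1048576 xids; segment files are named with
# the 4-hex-digit segment number (0000, 0001, ...).  Existing segments
# are only appended to, never renamed or deleted.
pad_clog() {
	clogdir="$1"
	max_xid="$2"
	last_seg=`expr $max_xid / 1048576`
	seg=0
	while [ "$seg" -le "$last_seg" ]
	do
		file="$clogdir"/`printf '%04X' $seg`
		cur=0
		[ -f "$file" ] && cur=`wc -c < "$file"`
		missing=`expr 262144 - $cur`
		if [ "$missing" -gt 0 ]
		then
			# Zeroed status bits are safe here because the old
			# cluster was vacuumed before the upgrade.
			dd if=/dev/zero bs=8192 count=`expr $missing / 8192` \
				>> "$file" 2>/dev/null
		fi
		seg=`expr $seg + 1`
	done
}
```

For instance, pad_clog "$PGDATA"/pg_clog 589031 would grow the 8192-byte 0000 segment to the full 262144 bytes, covering Brian's next transaction id of 589031.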
This is a good bug report.  I can fix pg_upgrade by adding clog files
containing zeros to pad out to the proper length.  However, my guess is
that most people have already upgraded to 7.2.X, so there isn't much
value in fixing it now.  I have updated pg_upgrade in CVS for 7.3, and
hopefully we will have it working and well tested by the time 7.3 is
released.

Compressed clog was new in 7.2, so I guess it is no surprise I missed
that change in pg_upgrade.  In 7.3, pg_clog will be moved over from the
old install, so this shouldn't be a problem with 7.3.

Thanks for the report.  Sorry I don't have a fix.

---------------------------------------------------------------------------

Brian Hirt wrote:
> I've started playing around with 7.2 on one of my development machines.
> I decided to try the pg_upgrade program, something I usually never do.
>
> Anyway, I followed the steps in the pg_upgrade (going from 7.1.3 to
> 7.2), and then when I started the database up after the upgrade finished
> and vacuumed one of my tables, i get these error messages from the
> postmaster.  After this point I cannot restart the postmaster without
> resetting the xlog.
>
> I've kept the PGDATA directory around incase someone thinks this is
> worth looking into, i would be more than happy to help out.
>
> If i migrate the data over manually like a always do (pg_dump then
> pg_restore), i don't have any problems.  Part of the problem might be
> path names for shared libraries specified in CREATE FUNCTION; I started
> using pg back when it was version 6 before '$libdir' was supported and I
> haven't bothered to take the absolute path names out yet -- i've just
> updated it with each release (each release is installed in a different
> location in case i need to roll back, and so i can test multiple version
> at one time).  not sure if pg_upgrade even checks for this.
I wouldn't be so quick to assume that almost everyone has upgraded by
now.  I know we have not, at least not in production.

On Tuesday 09 April 2002 02:14 pm, Bruce Momjian wrote:
> This is a good bug report.  I can fix pg_upgrade by adding clog files
> containing zeros to pad out to the proper length.  However, my guess is
> that most people have already upgraded to 7.2.X, so there isn't much
> value in fixing it now.  I have updated pg_upgrade in CVS for 7.3, and
> hopefully we will have it working and well tested by the time 7.3 is
> released.
>
> Compressed clog was new in 7.2, so I guess it is no surprise I missed
> that change in pg_upgrade.  In 7.3, pg_clog will be moved over from the
> old install, so this shouldn't be a problem with 7.3.
>
> Thanks for the report.  Sorry I don't have a fix.
* Matthew T. O'Connor (matthew@zeut.net) [020409 15:34]:
> I wouldn't be so quick to assume that almost everyone has upgraded by
> now.  I know we have not, at least not in production.

Yeah, what he said.  Test, QA and development yes; production, no.

-Brad
Bradley McLean wrote:
> * Matthew T. O'Connor (matthew@zeut.net) [020409 15:34]:
> > I wouldn't be so quick to assume that almost everyone has upgraded by
> > now.  I know we have not, at least not in production.
>
> Yeah, what he said.  Test, QA and development yes; production, no.

The question is whether anyone who has delayed installing 7.2 will be
using pg_upgrade.  Odds are they will not, and clearly we can't get
enough testing of pg_upgrade to be sure it will work well with 7.2.X.