Thread: pg_standby could not open wal file after selecting new timeline
I am trying to move a db from one machine to another. pg_standby applies all the logs fine, then I trigger it and this happens:

2008-11-05 11:43:45 EST [14853] LOG: restored log file "00000003000016ED0000007E" from archive
2008-11-05 11:43:45 EST [14853] LOG: selected new timeline ID: 4
2008-11-05 11:43:45 EST [14853] LOG: archive recovery complete
2008-11-05 11:43:48 EST [14853] FATAL: could not open file "pg_xlog/00000004000016ED0000007F" (log file 5869, segment 127): Invalid argument
2008-11-05 11:43:48 EST [14846] LOG: startup process (PID 14853) exited with exit code 1
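The FATAL line reports the missing segment as log file 5869, segment 127, on the newly selected timeline 4. Those numbers can be read straight off the 24-character file name, which consists of three 8-digit hexadecimal fields. A minimal sketch of that decoding follows; the function name parse_wal_segment_name is made up for illustration and is not part of PostgreSQL or pg_standby.

```python
def parse_wal_segment_name(name):
    """Split a 24-hex-digit WAL segment file name into its three fields.

    Returns (timeline, log_file, segment) as integers. The helper name is
    hypothetical; only the 8+8+8 hex layout is taken from the log above.
    """
    if len(name) != 24:
        raise ValueError("expected 24 hex digits, got %d characters" % len(name))
    timeline = int(name[0:8], 16)    # first 8 digits: timeline ID
    log_file = int(name[8:16], 16)   # middle 8 digits: log file number
    segment = int(name[16:24], 16)   # last 8 digits: segment within the log file
    return timeline, log_file, segment


if __name__ == "__main__":
    # The two file names that appear in the log output above.
    for name in ("00000003000016ED0000007E", "00000004000016ED0000007F"):
        print(name, parse_wal_segment_name(name))
```

Decoded this way, the last restored file is timeline 3, segment 126, and the file the startup process fails to open is the next segment, 127, on timeline 4, which matches the numbers in the FATAL message.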
Dave Cramer <pg@fastcrypt.com> writes:
> 2008-11-05 11:43:45 EST [14853] LOG: selected new timeline ID: 4
> 2008-11-05 11:43:45 EST [14853] LOG: archive recovery complete
> 2008-11-05 11:43:48 EST [14853] FATAL: could not open file "pg_xlog/00000004000016ED0000007F" (log file 5869, segment 127): Invalid argument

"Invalid argument"?? On the platforms I have handy, the only documented reason for open(2) to fail with EINVAL is illegal value of the flags argument, which should be impossible. What platform is this and what wal_sync_method are you using?

regards, tom lane
Tom,

On 5-Nov-08, at 12:21 PM, Tom Lane wrote:
> "Invalid argument"?? On the platforms I have handy, the only documented
> reason for open(2) to fail with EINVAL is illegal value of the flags
> argument, which should be impossible. What platform is this and what
> wal_sync_method are you using?

Red Hat Enterprise Linux Server release 5.2 (Tikanga)
wal_sync_method is open_sync

Thing is, the server is running off of a ramdisk for the move (very temporarily).

Dave
Dave Cramer <pg@fastcrypt.com> writes:
> On 5-Nov-08, at 12:21 PM, Tom Lane wrote:
>> "Invalid argument"?? On the platforms I have handy, the only documented
>> reason for open(2) to fail with EINVAL is illegal value of the flags
>> argument, which should be impossible. What platform is this and what
>> wal_sync_method are you using?

> Red Hat Enterprise Linux Server release 5.2 (Tikanga)
> wal_sync_method is open_sync
> Thing is the server is running off of a ramdisk for the move (very
> temporarily)

Huh, is it possible that Linux rejects O_SYNC for a file on ramdisk? I guess I could see an argument for doing that but it seems a tad anal-retentive. Try setting fsync = off and see what happens.

regards, tom lane
I wrote:
> Huh, is it possible that Linux rejects O_SYNC for a file on ramdisk?

I found this in the Fedora 9 manpage for open(2):

    O_DIRECT support was added under Linux in kernel version 2.4.10. Older
    Linux kernels simply ignore this flag. Some filesystems may not
    implement the flag and open() will fail with EINVAL if it is used.

so it may not be ramdisk per se that's the issue, but the filesystem you're using on it.

We set O_DIRECT along with O_SYNC whenever O_DIRECT is defined. I wonder whether there's a need to make that decision more configurable.

regards, tom lane
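One way to check whether a particular filesystem is the culprit is to try the same flag combinations on a scratch file in the directory in question. The sketch below is a rough probe written for illustration, not a reproduction of the server's own code path; the helper names try_open and probe_directory are invented here, and the result depends on the kernel and filesystem in use.

```python
import errno
import os
import sys
import tempfile

# O_DIRECT is Linux-specific; fall back to 0 where the os module lacks it,
# in which case the combined probe degenerates to a plain O_SYNC open.
O_DIRECT = getattr(os, "O_DIRECT", 0)
O_SYNC = getattr(os, "O_SYNC", 0)


def try_open(path, extra_flags):
    """Open path read-write with extra_flags; return None on success,
    or the symbolic errno name (e.g. 'EINVAL') if open() fails."""
    try:
        fd = os.open(path, os.O_RDWR | os.O_CREAT | extra_flags, 0o600)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return None


def probe_directory(directory):
    """Try the flag combinations discussed in this thread on a scratch file
    created inside directory, and report the outcome of each."""
    fd, path = tempfile.mkstemp(dir=directory, prefix="open_probe_")
    os.close(fd)
    try:
        return {
            "plain": try_open(path, 0),
            "O_SYNC": try_open(path, O_SYNC),
            "O_SYNC|O_DIRECT": try_open(path, O_SYNC | O_DIRECT),
        }
    finally:
        os.unlink(path)


if __name__ == "__main__":
    # Pass the directory to test (for example the ramdisk mount point);
    # defaults to the system temporary directory.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    for label, error in probe_directory(target).items():
        print("%s: %s" % (label, error or "ok"))
```

If the plain and O_SYNC opens succeed but the O_SYNC|O_DIRECT open reports EINVAL, that would point at the filesystem not implementing O_DIRECT, as the manpage excerpt describes.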
On 5-Nov-08, at 1:00 PM, Tom Lane wrote:
> I wrote:
>> Huh, is it possible that Linux rejects O_SYNC for a file on ramdisk?
>
> I found this in the Fedora 9 manpage for open(2):
>
>     O_DIRECT support was added under Linux in kernel version 2.4.10. Older
>     Linux kernels simply ignore this flag. Some filesystems may not
>     implement the flag and open() will fail with EINVAL if it is used.
>
> so it may not be ramdisk per se that's the issue, but the filesystem
> you're using on it.
>
> We set O_DIRECT along with O_SYNC whenever O_DIRECT is defined. I
> wonder whether there's a need to make that decision more configurable.
>

fsync=off works fine if that helps

> regards, tom lane