Thread: FATAL 2: InitRelink(logfile 0 seg 173) failed: No such file or directory
FATAL 2: InitRelink(logfile 0 seg 173) failed: No such file or directory
From
james@unifiedmind.com (James Thornton)
Date:
What does this mean, and what could be causing it?

FATAL 2: InitRelink(logfile 0 seg 173) failed: No such file or directory

That's the second time in as many months that I have received this error when trying to start postmaster after a crash -- both times a server reboot remedied the issue.

Thanks.
Just curious ... how often does the server crash?

Thanks

"James Thornton" <james@unifiedmind.com> wrote in message news:cabf0e7b.0206150908.1edab2f8@posting.google.com...
> What does this mean, and what could be causing it?
>
> FATAL 2: InitRelink(logfile 0 seg 173) failed: No such file or
> directory
>
> That's the second time in as many months that I have received this
> error when trying to start postmaster after a crash -- both times a
> server reboot remedied the issue.
>
> Thanks.
james@unifiedmind.com (James Thornton) writes:
> What does this mean, and what could be causing it?
> FATAL 2: InitRelink(logfile 0 seg 173) failed: No such file or
> directory
> That's the second time in as many months that I have received this
> error when trying to start postmaster after a crash -- both times a
> server reboot remedied the issue.

That really should be impossible --- it says that a rename() failed for
a file we just created.

I judge from the spelling of the error message that you are running 7.1.
I would recommend an update to 7.2, wherein the error message looks
more like this:

    if (rename(tmppath, path) < 0)
        elog(STOP, "rename from %s to %s (initialization of log file %u, segment %u) failed: %m",
             tmppath, path, log, seg);

(Alternatively, you could just edit the message in your existing sources
to include the actual source and destination pathnames given to rename()
--- it's in src/backend/access/transam/xlog.c, line 1396 in 7.1.3.)
That will allow us to eliminate the faint possibility that the code is
somehow miscomputing the pathnames occasionally.

However, given that you state a system reboot is necessary and
sufficient to make the problem go away, I am going to stick my neck
*way* out and suggest that:

1. You have the $PGDATA directory (or at least its pg_xlog subdirectory)
mounted via NFS.

2. This is an NFS problem.

In my book, no adequately-paranoid DBA will trust his database to NFS.
There are some cautionary tales in our mailing list archives...

    regards, tom lane
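The suggestion above, reporting the actual source and destination pathnames handed to rename(), can be sketched in standalone C as follows. This is only an illustration and not the PostgreSQL source: the function name rename_reporting_paths and the message layout are assumptions made for the example, and strerror() stands in for elog()'s %m.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a PostgreSQL function): rename `from` to `to`.
 * On failure, write a diagnostic that names both paths into `msg` and
 * return the errno value set by rename(). On success, return 0 and leave
 * `msg` empty.
 */
int
rename_reporting_paths(const char *from, const char *to,
                       char *msg, size_t msglen)
{
    if (rename(from, to) < 0)
    {
        /* Save errno first: later library calls may overwrite it. */
        int saved_errno = errno;

        snprintf(msg, msglen, "rename from %s to %s failed: %s",
                 from, to, strerror(saved_errno));
        return saved_errno;
    }

    if (msglen > 0)
        msg[0] = '\0';
    return 0;
}
```

With both paths in the message, a miscomputed pathname would be visible directly in the log rather than having to be inferred from the log file and segment numbers.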
James Thornton <thornton@cs.ecs.baylor.edu> writes:
> I am not running NFS on this system.

Oh well, scratch that theory. Perhaps you should tell us what you *are*
running --- what OS, what hardware? I still believe that this must be
a system-level bug and not directly Postgres' fault.

    regards, tom lane
Re: FATAL 2: InitRelink(logfile 0 seg 173) failed: No such file or directory
From
nield@usol.com
Date:
6/17/02 10:16:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

>james@unifiedmind.com (James Thornton) writes:
>> What does this mean, and what could be causing it?
>> FATAL 2: InitRelink(logfile 0 seg 173) failed: No such file or
>> directory
>> That's the second time in as many months that I have received this
>> error when trying to start postmaster after a crash -- both times a
>> server reboot remedied the issue.
>
>That really should be impossible --- it says that a rename() failed for
>a file we just created.
>
>I judge from the spelling of the error message that you are running 7.1.
>I would recommend an update to 7.2, wherein the error message looks
>more like this:
>
>    if (rename(tmppath, path) < 0)
>        elog(STOP, "rename from %s to %s (initialization of log file %u, segment %u) failed: %m",
>             tmppath, path, log, seg);
>
[snip]

From the xlog.c file in 7.3devel, in InstallXLogFileSegment(), look at the code near:

>    while ((fd = BasicOpenFile(path, O_RDWR | PG_BINARY,
>                               S_IRUSR | S_IWUSR)) >= 0)

It would seem like we assume that ANY failure of BasicOpenFile() implies that 'path' does not exist. So then we don't handle any other cases, and rename might fail because 'path' actually exists.

What if BasicOpenFile() got some other error?

This would seem to be wrong, but it still doesn't explain why BasicOpenFile() would be failing when 'path' exists in this particular case.

I don't have the 7.1 or 7.2 code around, and I've never looked at it.

J.R. Nield
nield@usol.com
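The distinction raised here, between "open failed because the path does not exist" and "open failed for some other reason", can be sketched as below. This is a hedged standalone illustration using plain POSIX open() in place of BasicOpenFile(); the ProbeResult type and probe_path() function are hypothetical names invented for the example and do not appear in xlog.c.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not PostgreSQL identifiers. */
typedef enum
{
    PROBE_EXISTS,       /* open() succeeded, so the name is already taken */
    PROBE_MISSING,      /* open() failed with ENOENT: the name is free */
    PROBE_OTHER_ERROR   /* open() failed for some other reason */
} ProbeResult;

/*
 * Try to open `path` read/write and classify the outcome, instead of
 * treating every failure as "file does not exist".
 */
ProbeResult
probe_path(const char *path)
{
    int fd = open(path, O_RDWR);

    if (fd >= 0)
    {
        close(fd);
        return PROBE_EXISTS;
    }
    return (errno == ENOENT) ? PROBE_MISSING : PROBE_OTHER_ERROR;
}
```

A caller looping over candidate segment names could stop and report on PROBE_OTHER_ERROR rather than proceeding to the rename step as if the name were free.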
Tom Lane wrote:
>
> That really should be impossible --- it says that a rename() failed for
> a file we just created.
>
> I judge from the spelling of the error message that you are running 7.1.

7.1.3

> However, given that you state a system reboot is necessary and
> sufficient to make the problem go away, I am going to stick my neck
> *way* out and suggest that:
>
> 1. You have the $PGDATA directory (or at least its pg_xlog subdirectory)
> mounted via NFS.
>
> 2. This is an NFS problem.

I am not running NFS on this system.
Tom Lane wrote:
>
> James Thornton <thornton@cs.ecs.baylor.edu> writes:
> > I am not running NFS on this system.
>
> Oh well, scratch that theory. Perhaps you should tell us what you *are*
> running --- what OS, what hardware? I still believe that this must be
> a system-level bug and not directly Postgres' fault.

[nsadmin@roam proc]$ cat version cpuinfo meminfo pci
Linux version 2.4.7-10smp (bhcompile@stripples.devel.redhat.com) (gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-98)) #1 SMP Thu Sep 6 17:09:31 EDT 2001
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 7
model name      : Pentium III (Katmai)
stepping        : 3
cpu MHz         : 548.324
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 1094.45

        total:    used:    free:  shared: buffers:  cached:
Mem:  327278592 321400832  5877760   720896 10825728 52867072
Swap: 271392768  13783040 257609728
MemTotal:       319608 kB
MemFree:          5740 kB
MemShared:         704 kB
Buffers:         10572 kB
Cached:          39552 kB
SwapCached:      12076 kB
Active:          21956 kB
Inact_dirty:     40668 kB
Inact_clean:       280 kB
Inact_target:      480 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       319608 kB
LowFree:          5740 kB
SwapTotal:      265032 kB
SwapFree:       251572 kB
NrSwapPages:     62893 pages

PCI devices found:
  Bus 0, device 0, function 0:
    Host bridge: Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge (rev 3).
      Master Capable. Latency=64.
      Prefetchable 32 bit memory at 0xf0000000 [0xf3ffffff].
  Bus 0, device 1, function 0:
    PCI bridge: Intel Corporation 440BX/ZX - 82443BX/ZX AGP bridge (rev 3).
      Master Capable. Latency=64. Min Gnt=136.
  Bus 0, device 7, function 0:
    ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 2).
  Bus 0, device 7, function 1:
    IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 1).
      Master Capable. Latency=32.
      I/O at 0x1000 [0x100f].
  Bus 0, device 7, function 2:
    USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 1).
      IRQ 14.
      Master Capable. Latency=64.
      I/O at 0xdce0 [0xdcff].
  Bus 0, device 7, function 3:
    Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 2).
      IRQ 9.
  Bus 0, device 14, function 0:
    Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 4).
      IRQ 11.
      Master Capable. Latency=64. Min Gnt=8. Max Lat=56.
      Prefetchable 32 bit memory at 0xf7000000 [0xf7000fff].
      I/O at 0xdcc0 [0xdcdf].
      Non-prefetchable 32 bit memory at 0xff000000 [0xff0fffff].
  Bus 0, device 15, function 0:
    PCI bridge: Digital Equipment Corporation DECchip 21152 (rev 3).
      Master Capable. Latency=64. Min Gnt=2.
  Bus 0, device 17, function 0:
    Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 36).
      IRQ 14.
      Master Capable. Latency=64. Min Gnt=10. Max Lat=10.
      I/O at 0xdc00 [0xdc7f].
      Non-prefetchable 32 bit memory at 0xff100000 [0xff10007f].
  Bus 1, device 0, function 0:
    VGA compatible controller: ATI Technologies Inc 3D Rage Pro AGP 1X/2X (rev 92).
      IRQ 9.
      Master Capable. Latency=64. Min Gnt=8.
      Non-prefetchable 32 bit memory at 0xfd000000 [0xfdffffff].
      I/O at 0xfc00 [0xfcff].
      Non-prefetchable 32 bit memory at 0xfcfff000 [0xfcffffff].
  Bus 2, device 9, function 0:
    Unknown mass storage controller: Promise Technology, Inc. 20262 (rev 1).
      IRQ 9.
      Master Capable. Latency=64.
      I/O at 0xecf8 [0xecff].
      I/O at 0xecf0 [0xecf3].
      I/O at 0xece0 [0xece7].
      I/O at 0xecd8 [0xecdb].
      I/O at 0xec80 [0xecbf].
      Non-prefetchable 32 bit memory at 0xfafe0000 [0xfaffffff].
nield@usol.com writes:
> What if BasicOpenFile() got some other error?

Doesn't really matter; anything else would be a problem we can't recover
from anyhow. Besides, given that rename is failing with ENOENT, a
conflict on the destination name does not appear to be the issue.

    regards, tom lane
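The ENOENT argument can be checked with a small standalone experiment. On a POSIX system, rename() onto an existing regular file replaces it rather than failing, while renaming a source that no longer exists fails with ENOENT. The sketch below assumes a writable /tmp; make_temp_file() and rename_result() are hypothetical helper names invented for the example.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an empty temporary file. `tmpl` must be a
 * writable buffer ending in "XXXXXX"; mkstemp() fills in the real name.
 * Returns 0 on success, -1 on failure.
 */
int
make_temp_file(char *tmpl)
{
    int fd = mkstemp(tmpl);

    if (fd < 0)
        return -1;
    close(fd);
    return 0;
}

/* Return 0 if rename() succeeds, otherwise the errno value it set. */
int
rename_result(const char *from, const char *to)
{
    return (rename(from, to) < 0) ? errno : 0;
}
```

Creating two temporary files a and b, rename_result(a, b) returns 0 even though b already exists; calling it a second time returns ENOENT because a is gone. That matches the reasoning that an existing destination would not by itself produce this error.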