Thread: FATAL: could not reattach to shared memory (Win32)
Hello all, I am having problems with the following PostgreSQL version: pg version: 8.2.4 OS: Win32 (Windows XP SP2) FS: NTFS It is a production server, but suddenly the DB stopped answering any SQL command. It seemed dead. After restarting the server, everything started working again. I looked for errors in the system event log and found nothing there, but I have a lot of messages in the application event log. The same error repeats every ten seconds or so. This is the main error: * FATAL: could not reattach to shared memory (key=5432001, addr=01D80000): Invalid argument It is always followed by this other application-log error: * LOG: unrecognized win32 error code: 487 I found this during an intensive internet search: http://archives.postgresql.org/pgsql-bugs/2007-01/msg00032.php I need to solve this ASAP. Does anybody have any idea about this? Thanks.
Terry Yapt wrote: > I am looking for system errors and nothing is there. But I have a lot of > messages on system APP errors. The error is the same every ten seconds or > so. > > This is the main error: > * FATAL: could not reattach to shared memory (key=5432001, addr=01D80000): > Invalid argument Please run "ipcs" on a command line window and paste the results. I see a minor problem in that code: we are invoking two system calls (shmget and shmat) but the log does not say which one failed. However in this case it seems only shmget could be returning EINVAL. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Sorry, I have not been able to execute "ipcs" on Windows; it doesn't exist there. I have tried to find a utility that gives me the same information, or a port of ipcs to Win32, but I haven't had any luck. If I can do anything else to help, please tell me. Greetings. Alvaro Herrera wrote: > Terry Yapt wrote: > > >> I am looking for system errors and nothing is there. But I have a lot of >> messages on system APP errors. The error is the same every ten seconds or >> so. >> >> This is the main error: >> * FATAL: could not reattach to shared memory (key=5432001, addr=01D80000): >> Invalid argument >> > > Please run "ipcs" on a command line window and paste the results. > > I see a minor problem in that code: we are invoking two system calls > (shmget and shmat) but the log does not say which one failed. However > in this case it seems only shmget could be returning EINVAL. > >
Terry Yapt wrote: > This is the main error: > * FATAL: could not reattach to shared memory (key=5432001, addr=01D80000): > Invalid argument > > It is always followed by this another system-app error: > * LOG: unrecognized win32 error code: 487 FWIW, http://help.netop.com/support/errorcodes/win32_error_codes.htm says 487 Attempt to access invalid address. ERROR_INVALID_ADDRESS This problem has been reported before, for example in http://bbs.chinaunix.net/thread-973003-1-1.html (not that I can read it very well) and http://lists.pgfoundry.org/pipermail/brasil-usuarios/20061127/003150.html No resolution seems to have been found. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
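A note on why the two log lines travel together: the sketch below assumes the Windows port translates Win32 error codes into errno values through a lookup table and falls back to EINVAL ("Invalid argument") for any code the table does not contain. The table contents and the name map_win32_error are hypothetical and for illustration only; they are not taken from the PostgreSQL source.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Win32 error code values as documented in winerror.h. */
#define W32_ERROR_FILE_NOT_FOUND   2UL
#define W32_ERROR_ACCESS_DENIED    5UL
#define W32_ERROR_INVALID_ADDRESS  487UL

struct errmap
{
    unsigned long win32;   /* Win32 error code */
    int           posix;   /* errno value reported instead */
};

/* Hypothetical table: 487 is deliberately absent, as the log suggests. */
static const struct errmap known_errors[] = {
    { W32_ERROR_FILE_NOT_FOUND, ENOENT },
    { W32_ERROR_ACCESS_DENIED,  EACCES },
};

/* Translate a Win32 error code; unknown codes are logged and become EINVAL. */
static int map_win32_error(unsigned long code)
{
    size_t i;

    assert(code != 0UL);   /* 0 is ERROR_SUCCESS; there is nothing to map */

    for (i = 0; i < sizeof(known_errors) / sizeof(known_errors[0]); i++)
        if (known_errors[i].win32 == code)
            return known_errors[i].posix;

    fprintf(stderr, "LOG: unrecognized win32 error code: %lu\n", code);
    return EINVAL;
}
```

Under that assumption, a failed mapping call that reports ERROR_INVALID_ADDRESS (487) would surface as a FATAL message ending in "Invalid argument", accompanied by the "unrecognized win32 error code: 487" LOG line.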
Alvaro Herrera wrote: > Terry Yapt wrote: > > >> This is the main error: >> * FATAL: could not reattach to shared memory (key=5432001, addr=01D80000): >> Invalid argument >> >> It is always followed by this another system-app error: >> * LOG: unrecognized win32 error code: 487 >> > > This problem has been reported before, for example in > > http://bbs.chinaunix.net/thread-973003-1-1.html > (not that I can read it very well) > > and > > http://lists.pgfoundry.org/pipermail/brasil-usuarios/20061127/003150.html > > Yes, those are the same as the one here: http://archives.postgresql.org/pgsql-bugs/2007-01/msg00032.php > No resolution seems to have been found. > Then I am very worried now. :-| Thanks, Alvaro.
Alvaro Herrera wrote: > Terry Yapt wrote: > >> This is the main error: >> * FATAL: could not reattach to shared memory (key=5432001, addr=01D80000): >> Invalid argument >> >> It is always followed by this another system-app error: >> * LOG: unrecognized win32 error code: 487 > > FWIW, > http://help.netop.com/support/errorcodes/win32_error_codes.htm > > says > 487 Attempt to access invalid address. ERROR_INVALID_ADDRESS > > This problem has been reported before, for example in > > http://bbs.chinaunix.net/thread-973003-1-1.html > (not that I can read it very well) > > and > > http://lists.pgfoundry.org/pipermail/brasil-usuarios/20061127/003150.html > > No resolution seems to have been found. 8.3 will have a new way to deal with shared mem on win32. It's the same underlying tech, but we're no longer trying to squeeze it into an emulation of sysv. With a bit of luck, that'll help :-) //Magnus
Magnus Hagander wrote: > Alvaro Herrera wrote: > > No resolution seems to have been found. > > 8.3 will have a new way to deal with shared mem on win32. It's the same > underlying tech, but we're no longer trying to squeeze it into an > emulation of sysv. With a bit of luck, that'll help :-) So you're saying we won't fix this bug in 8.2? That seems unfortunate, given that 8.2 is still supposed to be supported on Windows. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>----- Original Message ---- >From: Magnus Hagander <magnus@hagander.net> >To: Alvaro Herrera <alvherre@commandprompt.com> >Cc: Terry Yapt <yapt@technovell.com>; pgsql-general@postgresql.org >Sent: Thursday, August 23, 2007 3:43:32 PM >Subject: Re: [GENERAL] FATAL: could not reattach to shared memory (Win32) > > >8.3 will have a new way to deal with shared mem on win32. It's the same >underlying tech, but we're no longer trying to squeeze it into an >emulation of sysv. With a bit of luck, that'll help :-) > >//Magnus > Wild guess on my part... could that error be the result of an attempt to map shared memory into a process at a fixed location that just happens to already be occupied by a dll that Windows had decided to relocate? Regards, Shelby Cain
Alvaro Herrera <alvherre@commandprompt.com> writes: > Magnus Hagander wrote: >> 8.3 will have a new way to deal with shared mem on win32. It's the same >> underlying tech, but we're no longer trying to squeeze it into an >> emulation of sysv. With a bit of luck, that'll help :-) > So you're saying we won't fix this bug in 8.2? Well, we certainly aren't going to back-patch a major rewrite that (1) hasn't made it through beta testing, and (2) is not actually known to fix the bug. When and if those gating conditions stop being true, maybe we could consider a back-patch. But at the moment this is all speculation ... I counsel concentrating on finding out what's really happening on Terry's machine, before trying to guess whether we already have a fix written. regards, tom lane
Shelby Cain wrote: >> ----- Original Message ---- From: Magnus Hagander >> <magnus@hagander.net> To: Alvaro Herrera >> <alvherre@commandprompt.com> Cc: Terry Yapt <yapt@technovell.com>; >> pgsql-general@postgresql.org Sent: Thursday, August 23, 2007 >> 3:43:32 PM Subject: Re: [GENERAL] FATAL: could not reattach to >> shared memory (Win32) >> >> >> 8.3 will have a new way to deal with shared mem on win32. It's the >> same underlying tech, but we're no longer trying to squeeze it into >> an emulation of sysv. With a bit of luck, that'll help :-) >> >> //Magnus >> > > Wild guess on my part... could that error be the result of an attempt > to map shared memory into a process at a fixed location that just > happens to already be occupied by a dll that Windows had decided to > relocate? Not that wild a guess, really :-) I'd say it's a very good possibility - but I have no idea why it'd do that, since all backends load the same DLLs at that stage. //Magnus
On 8/23/07, Magnus Hagander <magnus@hagander.net> wrote: > Shelby Cain wrote: > > Wild guess on my part... could that error be the result of an attempt > > to map shared memory into a process at a fixed location that just > > happens to already be occupied by a dll that Windows had decided to > > relocate? > > Not that wild a guess, really :-) I'd say it's a very good possibility - > but I have no idea why it'd do that, since all backends load the same > DLLs at that stage. Not a valid assumption; you can't rely on consistent VM space among multiple [non-cloned] processes without a serious amount of effort. Anything can use that space, it's not just file views. Obviously it happens to work some of the time, but when it doesn't, it doesn't. I gather postgres depends on it being at the same address, and fixing that isn't trivial? If everything relevant is going through the intriguing internal_forkexec(), you could probably reserve address space there before resuming the thread. You'd want to combine this with picking address space that's less likely to be used before creating the shared memory section. (Actually, if you're doing that, you might as well just inject the backend variables too instead of going through the mapped file gymnastics.) Not a simple change, but would likely make this particular problem go away (assuming this is the problem). It's also the first time I've looked at the source, so perhaps I missed something.
"Trevor Talbot" <quension@gmail.com> writes: > I gather postgres depends on it being at the same address, and fixing that > isn't trivial? I haven't been following the rest of the thread so I'm not sure if this is important. But no, fixing that should be relatively trivial as there are already some configurations where it's not the case (the EXEC_BACKEND case I believe). The rest of the system uses a shared memory base pointer and references everything relative to that. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Trevor Talbot wrote: > On 8/23/07, Magnus Hagander <magnus@hagander.net> wrote: > > Shelby Cain wrote: > > > > Wild guess on my part... could that error be the result of an attempt > > > to map shared memory into a process at a fixed location that just > > > happens to already be occupied by a dll that Windows had decided to > > > relocate? > > > > Not that wild a guess, really :-) I'd say it's a very good possibility - > > but I have no idea why it'd do that, since all backends load the same > > DLLs at that stage. > > Not a valid assumption; you can't rely on consistent VM space among > multiple [non-cloned] processes without a serious amount of effort. > Anything can use that space, it's not just file views. Obviously it > happens to work some of the time, but when it doesn't, it doesn't. I > gather postgres depends on it being at the same address, and fixing > that isn't trivial? > > If everything relevant is going through the intriguing > internal_forkexec(), you could probably reserve address space there > before resuming the thread. You'd want to combine this with picking > address space that's less likely to be used before creating the shared > memory section. (Actually, if you're doing that, you might as well > just inject the backend variables too instead of going through the > mapped file gymnastics.) > > Not a simple change, but would likely make this particular problem go > away (assuming this is the problem). It's also the first time I've > looked at the source, so perhaps I missed something. I think this is accurate. When we created the Win32 native port there was a lot of concern about how to handle shared memory in the EXEC_BACKEND case, namely that postmaster children were not copies which had the same shared memory mappings, but rather were new processes that had to attach to shared memory at a fixed address.
The WIN32 solution was to create the shared memory in the parent, and then pass that address value down to the children to use in attaching to the existing segment. We expected all sorts of problems with this but in fact it seemed to work fine (most of the time). As you can see it doesn't work 100% of the time, but it worked more reliably than we expected. What we have been waiting for is someone who can recreate a failure so we can track down how to best make it 100% reliable, and as you can see, we haven't had a flood of problem reports to track this down. If you want to help make it 100% we will work with you to find the solution. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Gregory Stark wrote: > "Trevor Talbot" <quension@gmail.com> writes: > > > I gather postgres depends on it being at the same address, and fixing that > > isn't trivial? > > I haven't been following the rest of the thread so I'm not sure if this is > important. But no, fixing that should be relatively trivial as there are > already some configurations where it's not the case (the EXEC_BACKEND case I > believe). The rest of the system uses a shared memory base pointer and > references everything relative to that. This is inaccurate, I believe. The original Berkeley code did exec() for backends and hence allowed shared memory to be at different addresses for different backends, but we started using fork() and eliminated much of that capability for performance and clarity reasons, so right now all backends have to have shared memory at the same address, and changing this will not be simple. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
"Trevor Talbot" <quension@gmail.com> writes: > On 8/23/07, Magnus Hagander <magnus@hagander.net> wrote: >> Not that wild a guess, really :-) I'd say it's a very good possibility - >> but I have no idea why it'd do that, since all backends load the same >> DLLs at that stage. > Not a valid assumption; you can't rely on consistent VM space among > multiple [non-cloned] processes without a serious amount of effort. I'm not sure if you have a specific technical meaning of "clone" in mind here, but these processes are all executing the identical executable, and taking care to map the shmem early in execution *before* they load any DLLs. So it should work. Apparently, it *does* work for awhile for the OP, and then stops working, which is even odder. > I gather postgres depends on it being at the same address, and fixing > that isn't trivial? That's correct, and not having to change it is not negotiable --- finding a way to make this work was one of the gating factors that made it practical to have a Windows port at all. If you've got a specific suggestion for making it more reliable, we're all ears. regards, tom lane
Gregory Stark <stark@enterprisedb.com> writes: > "Trevor Talbot" <quension@gmail.com> writes: >> I gather postgres depends on it being at the same address, and fixing that >> isn't trivial? > I haven't been following the rest of the thread so I'm not sure if this is > important. But no, fixing that should be relatively trivial as there are > already some configurations where it's not the case (the EXEC_BACKEND case I > believe). The rest of the system uses a shared memory base pointer and > references everything relative to that. That hasn't been the case for quite a few years, and we're not going back. The pointer-to-offset-and-back gymnastics that that required were utterly destructive to code readability and maintainability, mainly because if everything stored in shmem data structures is an "offset" then you can't get any useful error checking from the compiler about how you are using the fields. It's like decreeing that every pointer must be declared "void *" and cast to something else when it's used. There are a few old bits of code that still use MAKE_PTR/MAKE_OFFSET, but I think it's mostly just that no one's bothered to rewrite the code for SHM_QUEUE linked lists. The vast majority of our shmem structures use regular pointers, and have for years. regards, tom lane
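To make the trade-off concrete, here is a small self-contained sketch contrasting the two styles. The names OFFSET_TO_PTR, PTR_TO_OFFSET, shm_offset, and the structs are hypothetical stand-ins chosen for illustration; they are not PostgreSQL's actual MAKE_PTR/MAKE_OFFSET definitions, and an ordinary static array stands in for the shared memory segment.

```c
#include <assert.h>
#include <stddef.h>

struct lock_entry
{
    int holder_pid;
};

/* Stand-in for the shared segment: a small array of lock entries. */
static struct lock_entry slots[8];
static char *shm_base = (char *) slots;

typedef size_t shm_offset;   /* hypothetical offset type */

/* Hypothetical conversion macros in the spirit of offset-based code. */
#define OFFSET_TO_PTR(off)  ((void *) (shm_base + (off)))
#define PTR_TO_OFFSET(ptr)  ((shm_offset) ((char *) (ptr) - shm_base))

/* Offset style: the field does not say what kind of object it refers to. */
struct proc_by_offset
{
    shm_offset waiting_on;
};

/* Pointer style: the compiler knows this refers to a lock_entry. */
struct proc_by_pointer
{
    struct lock_entry *waiting_on;
};

static int holder_via_offset(const struct proc_by_offset *p)
{
    /* The void * result converts to any pointer type, right or wrong. */
    struct lock_entry *lock = OFFSET_TO_PTR(p->waiting_on);

    assert(p->waiting_on < sizeof(slots));
    return lock->holder_pid;
}

static int holder_via_pointer(const struct proc_by_pointer *p)
{
    /* Assigning the wrong pointer type to waiting_on draws a diagnostic. */
    return p->waiting_on->holder_pid;
}
```

The offset form keeps working wherever the segment is mapped, which is the property Gregory had in mind; the pointer form is only valid if every process maps the segment at the same address, which is the requirement this thread is about.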
Tom Lane wrote: > There are a few old bits of code that still use MAKE_PTR/MAKE_OFFSET, > but I think it's mostly just that no one's bothered to rewrite the code > for SHM_QUEUE linked lists. The vast majority of our shmem structures > use regular pointers, and have for years. ... except that, not knowing that, I wrote part of the new autovac code using MAKE_PTR/OFFSET, and it needs to be rewritten eventually :-( -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > There are a few old bits of code that still use MAKE_PTR/MAKE_OFFSET, > but I think it's mostly just that no one's bothered to rewrite the code > for SHM_QUEUE linked lists. The vast majority of our shmem structures > use regular pointers, and have for years. Ah, I happened to be recently in that code so I was misled. So even in EXEC_BACKEND we require that we can attach to the shared memory at a specified location. hm. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
>----- Original Message ---- >From: Magnus Hagander <magnus@hagander.net> >To: Shelby Cain <alyandon@yahoo.com> >Cc: Alvaro Herrera <alvherre@commandprompt.com>; Terry Yapt <yapt@technovell.com>; pgsql-general@postgresql.org >Sent: Friday, August 24, 2007 1:08:44 AM >Subject: Re: [GENERAL] FATAL: could not reattach to shared memory (Win32) > >Not that wild a guess, really :-) I'd say it's a very good possibility - >but I have no idea why it'd do that, since all backends load the same >DLLs at that stage. > >//Magnus > Assuming this is an issue with shared libraries, I think it would have more to do with the way Windows resolves address conflicts on process startup than anything caused by explicit calls to LoadLibrary(). Looking at postgres.exe with the dependency viewer from Visual Studio 6, I see that the following shared library dependencies embedded in the executable image have conflicting base addresses: libeay32.dll - 0x10000000; libiconv-2.dll - 0x10000000; libintl-2.dll - 0x10000000; ssleay32.dll - 0x10000000; comerr32.dll - 0x1c000000; krb5_32.dll - 0x1c000000. If I'm not mistaken, Windows will automatically relocate these libraries prior to actual code execution, so there would be no opportunity for that particular instance of postgres.exe to map the shared memory if the address space is already in use by a relocated dll. I also found a KB article that specifically addresses ERROR_INVALID_MEMORY being returned from MapViewOfFileEx(). http://support.microsoft.com/kb/125713 The article covers the case where multiple processes must use the same address for mappings and how to accomplish that under Windows. Search for "Addresses of Mapped Views". The only thing that really gives me any pause is the fact that the article hasn't been updated past the NT 3.51/Windows 9x era, but the underlying behavior might not have changed in Windows 2000/XP/etc.
Regards, Shelby Cain
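The base-address observation can be checked mechanically. The following is a minimal sketch that counts how many of the listed DLLs would need relocating, under two simplifying assumptions that are not from the thread: the loader processes the DLLs in the order listed, and two images collide whenever their preferred bases are equal (image sizes are ignored). The function name count_forced_relocations is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct dll_info
{
    const char *name;
    uint32_t    preferred_base;   /* base address recorded in the image */
};

/* Preferred base addresses as reported in the message above. */
static const struct dll_info deps[] = {
    { "libeay32.dll",   0x10000000u },
    { "libiconv-2.dll", 0x10000000u },
    { "libintl-2.dll",  0x10000000u },
    { "ssleay32.dll",   0x10000000u },
    { "comerr32.dll",   0x1c000000u },
    { "krb5_32.dll",    0x1c000000u },
};

#define NUM_DEPS (sizeof(deps) / sizeof(deps[0]))

/*
 * Count DLLs whose preferred base is already claimed by an earlier entry.
 * Each such DLL cannot load at its preferred address and must be placed
 * somewhere else in the address space.
 */
static size_t count_forced_relocations(const struct dll_info *d, size_t n)
{
    size_t i, j, relocated = 0;

    assert(d != NULL);

    for (i = 1; i < n; i++)
    {
        for (j = 0; j < i; j++)
        {
            if (d[j].preferred_base == d[i].preferred_base)
            {
                relocated++;
                break;
            }
        }
    }
    return relocated;
}
```

With the six DLLs above, four of them lose the race for their preferred base. Where the loader puts those four is not something the process controls, which is why a relocated image could in principle land on the address the shared memory needs.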
Gregory Stark <stark@enterprisedb.com> writes: > "Tom Lane" <tgl@sss.pgh.pa.us> writes: >> There are a few old bits of code that still use MAKE_PTR/MAKE_OFFSET, >> but I think it's mostly just that no one's bothered to rewrite the code >> for SHM_QUEUE linked lists. The vast majority of our shmem structures >> use regular pointers, and have for years. > Ah, I happened to be recently in that code so I was mislead. IIRC, the reason for not bothering to change the SHM_QUEUE code (other than inertia) was that it's a generic linked list package, and so if it wasn't storing SHMEM_OFFSETs it'd be storing "void *"'s, and so there didn't seem to be any traction to be gained in terms of compiler error detection capability. However, if both you and Alvaro were confused about the liveness of that coding convention, maybe it'd be worth making a push to eliminate all trace of MAKE_PTR/MAKE_OFFSET. TODO for 8.4? regards, tom lane
Tom Lane wrote: > I'm not sure if you have a specific technical meaning of "clone" in mind > here, but these processes are all executing the identical executable, > and taking care to map the shmem early in execution *before* they load > any DLLs. So it should work. Apparently, it *does* work for awhile for > the OP, and then stops working, which is even odder. > > Yes, the Windows event log (application log section) shows no errors for several days. Then the errors suddenly come back and repeat at short intervals. They disappear again and return after a few hours. A few hours after that, the system stops responding. One curious detail: in the log lines I sent to the list (* FATAL: could not reattach to shared memory (key=5432001, addr=01D80000): Invalid argument), the value "addr=01D80000" is always the same, even though the system has been shut down and restarted and the error has been absent for days at a time. Greetings.
Shelby Cain <alyandon@yahoo.com> writes: > Assuming this is an issue with shared libraries, I think it would have more > to do with the way Windows resolves address conflicts on process startup > than anything caused by explicit calls to LoadLibrary(). Looking at > postgres.exe with the dependency viewer from Visual Studio 6, I see that the > following shared library dependencies embedded in the executable image that > having conflicting base addresses. If I'm not mistaken, Windows will > automatically relocate these libraries prior to actual code execution so there > would be no opportunity for that particular instance of postgres.exe to map the > shared memory if the address space is already in use by a relocated dll. But the shmem was originally allocated in the postmaster process, which is the identical executable with the identical set of linked-in DLLs. So it's really unclear why the child processes would be unable to reattach at the same address. regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Gregory Stark <stark@enterprisedb.com> writes: >> "Tom Lane" <tgl@sss.pgh.pa.us> writes: >>> There are a few old bits of code that still use MAKE_PTR/MAKE_OFFSET, >>> but I think it's mostly just that no one's bothered to rewrite the code >>> for SHM_QUEUE linked lists. The vast majority of our shmem structures >>> use regular pointers, and have for years. > >> Ah, I happened to be recently in that code so I was mislead. > > IIRC, the reason for not bothering to change the SHM_QUEUE code (other > than inertia) was that it's a generic linked list package, and so if > it wasn't storing SHMEM_OFFSETs it'd be storing "void *"'s, and so there > didn't seem to be any traction to be gained in terms of compiler error > detection capability. However, if both you and Alvaro were confused > about the liveness of that coding convention, maybe it'd be worth making > a push to eliminate all trace of MAKE_PTR/MAKE_OFFSET. TODO for 8.4? It would also make using gdb to look at the lock queues a bit less of a pain. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
On 8/24/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Trevor Talbot" <quension@gmail.com> writes: > > On 8/23/07, Magnus Hagander <magnus@hagander.net> wrote: > >> Not that wild a guess, really :-) I'd say it's a very good possibility - > >> but I have no idea why it'd do that, since all backends load the same > >> DLLs at that stage. > > > Not a valid assumption; you can't rely on consistent VM space among > > multiple [non-cloned] processes without a serious amount of effort. > > I'm not sure if you have a specific technical meaning of "clone" in mind > here, but these processes are all executing the identical executable, > and taking care to map the shmem early in execution *before* they load > any DLLs. So it should work. Apparently, it *does* work for awhile for > the OP, and then stops working, which is even odder. "Clone" in the same sense as fork(): duplicating a process instead of regenerating it. Even ignoring things like DLL replacement and LD_PRELOAD-style options, there's still a lot of opportunity for dynamic behavior. All DLLs have an initialization routine called by the loader (and on thread creation), which tends to be used to set up things you don't want the caller to have to explicitly initialize. DLLs that maintain global state they share with copies of themselves in other processes can set up shared memory etc to do that. They can easily change their behavior based on the environment at the time of process start. There are also all the hooks for extension points, such as Winsock LSPs. Most such things happen only after an explicit initialization (e.g. WSAStartup() or socket creation in the Winsock case), but between the C runtime and third-party libraries, it may be happening when you don't expect it. All that said, I don't actually have a real-world example of process VM layout changing like this, especially since you are using it early to avoid this very problem. 
I'd love to find out exactly what's going on in Terry's case, but I haven't come up with a good way to do it that doesn't disturb his production environment. > If you've got a specific suggestion for making it more reliable, > we're all ears. To elaborate on what I said earlier, internal_forkexec() creates the process suspended; while it has an execution environment set up, the loader hasn't done all the DLL linking and initialization yet, so the address space is relatively untouched. At that point you could use VirtualAllocEx() to reserve VM space for the shared memory at the right address, and proceed with the rest of the setup. When the new backend starts up, it would then VirtualFree() that space immediately before calling MapViewOfFileEx() on it. I can probably set up with the 8.3 tree and MSVC to create an artificial failure, and play with the above as a fix, but I'm not quite sure when that will be. There's still the issue of verifying it is the problem on Terry's machine, and figuring out a fix for him. On 8/24/07, Terry Yapt <yapt@technovell.com> wrote: > Yes, the windows system log (application log section) doesn't show any > error in several days. Suddenly errors bring back to life and syslog > errors repeats every few time. But again errors disappears and return > in a few hours. After few hours the system goes out. > > Curiosity: > ====== > On the log lines I have and I sent to the list: * FATAL: could not > reattach to shared memory (key=5432001, addr=01D80000): Invalid argument > , this one: "addr=01D80000" is always the same in spite of the system > have been shutting down and restarted or the error was out for a days. The environment is consistent then. Whatever is going on, when postgres first starts things are normal, something just changes later and the change is temporary. As vague guides, I would look at some kind of global resource usage/tracking, and scheduled tasks. Do you see any patterns about WHEN this happens? During high load periods? 
Any antivirus or other security type tasks running on the machine? Any third-party VPN type software? Fast User Switching or Remote Desktop use?
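The reserve-then-release idea described in this message could be sketched roughly as follows. This is an illustration under stated assumptions, not PostgreSQL code: the function names reserve_in_child, reattach_in_child, and is_granularity_aligned are hypothetical, the Win32 part compiles only on Windows, and the 64 KB allocation granularity is assumed rather than queried (real code would ask GetSystemInfo()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed allocation granularity; mapping addresses must be multiples of it. */
#define ASSUMED_GRANULARITY ((uintptr_t) 0x10000)

/* Portable check that an address is a legal base for a mapped view. */
static int is_granularity_aligned(uintptr_t addr)
{
    return (addr & (ASSUMED_GRANULARITY - 1)) == 0;
}

#ifdef _WIN32
#include <windows.h>

/*
 * Parent side: called while the child is still suspended, before the loader
 * has placed any DLLs. Reserves the range so nothing else can claim it.
 */
static int reserve_in_child(HANDLE child_process, void *addr, SIZE_T size)
{
    void *reserved;

    assert(is_granularity_aligned((uintptr_t) addr));
    reserved = VirtualAllocEx(child_process, addr, size,
                              MEM_RESERVE, PAGE_NOACCESS);
    return reserved == addr;
}

/*
 * Child side: release the placeholder reservation and map the shared
 * segment at the same address right away.
 */
static void *reattach_in_child(HANDLE mapping, void *addr)
{
    if (!VirtualFree(addr, 0, MEM_RELEASE))   /* size must be 0 here */
        return NULL;
    return MapViewOfFileEx(mapping, FILE_MAP_ALL_ACCESS, 0, 0, 0, addr);
}
#endif   /* _WIN32 */
```

There is still a short window between VirtualFree() and MapViewOfFileEx() in which another thread in the child could take the range, so the sketch narrows the problem rather than eliminating it. The address from the log, 01D80000, does pass the alignment check, so misalignment would not explain the failure.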
Trevor Talbot wrote: > The environment is consistent then. Whatever is going on, when > postgres first starts things are normal, something just changes later > and the change is temporary. As vague guides, I would look at some > kind of global resource usage/tracking, and scheduled tasks. Do you > see any patterns about WHEN this happens? During high load periods? > Any antivirus or other security type tasks running on the machine? > Any third-party VPN type software? Fast User Switching or Remote > Desktop use? I have spent a lot of time looking for patterns in the system logs, Apache logs, Postgres logs, etc., but I have not found any conclusive clue. I can only say that I see this kind of error in the PostgreSQL logs: '2007-08-21 15:19:21 ERROR: could not open relation 16692/16694/17295: Invalid argument' The next log line(s) show the statement that failed. But that statement runs fine and gives correct results when I copy and paste it into any SQL editor connected to that DB. Those errors do not coincide in time with the FATAL errors we are discussing in this thread. I am trying to get the opportunity to migrate that DB to another server and use the current server to test anything we want, but the customer is reluctant to let me use it for testing because it is their mail and web server too. :-( Although that server is far away from my location, I have a remote console (UltraVNC) to it if you need me to look for something. Greetings.
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --------------------------------------------------------------------------- Magnus Hagander wrote: > Shelby Cain wrote: > >> ----- Original Message ---- From: Magnus Hagander > >> <magnus@hagander.net> To: Alvaro Herrera > >> <alvherre@commandprompt.com> Cc: Terry Yapt <yapt@technovell.com>; > >> pgsql-general@postgresql.org Sent: Thursday, August 23, 2007 > >> 3:43:32 PM Subject: Re: [GENERAL] FATAL: could not reattach to > >> shared memory (Win32) > >> > >> > >> 8.3 will have a new way to deal with shared mem on win32. It's the > >> same underlying tech, but we're no longer trying to squeeze it into > >> an emulation of sysv. With a bit of luck, that'll help :-) > >> > >> //Magnus > >> > > > > Wild guess on my part... could that error be the result of an attempt > > to map shared memory into a process at a fixed location that just > > happens to already be occupied by a dll that Windows had decided to > > relocate? > > Not that wild a guess, really :-) I'd say it's a very good possibility - > but I have no idea why it'd do that, since all backends load the same > DLLs at that stage. > > //Magnus -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > This has been saved for the 8.4 release: > > http://momjian.postgresql.org/cgi-bin/pgpatches_hold Update: I have installed PostgreSQL 8.2.5 and moved the database from the old server to a new one. This was 2 weeks ago. The new server is a Windows 2003 Server running other services too. So far the problem has not reappeared, and PostgreSQL is running like a charm on the new server. :-) Greetings.
Added to TODO: * Remove use of MAKE_PTR and MAKE_OFFSET macros http://archives.postgresql.org/pgsql-general/2007-08/msg01510.php --------------------------------------------------------------------------- Tom Lane wrote: > Gregory Stark <stark@enterprisedb.com> writes: > > "Trevor Talbot" <quension@gmail.com> writes: > >> I gather postgres depends on it being at the same address, and fixing that > >> isn't trivial? > > > I haven't been following the rest of the thread so I'm not sure if this is > > important. But no, fixing that should be relatively trivial as there are > > already some configurations where it's not the case (the EXEC_BACKEND case I > > believe). The rest of the system uses a shared memory base pointer and > > references everything relative to that. > > That hasn't been the case for quite a few years, and we're not going back. > The pointer-to-offset-and-back gymnastics that that required were > utterly destructive to code readability and maintainability, mainly > because if everything stored in shmem data structures is an "offset" > then you can't get any useful error checking from the compiler about how > you are using the fields. It's like decreeing that every pointer > must be declared "void *" and cast to something else when it's used. > > There are a few old bits of code that still use MAKE_PTR/MAKE_OFFSET, > but I think it's mostly just that no one's bothered to rewrite the code > for SHM_QUEUE linked lists. The vast majority of our shmem structures > use regular pointers, and have for years. > > regards, tom lane -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com
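To illustrate the point about compiler error checking, here is a minimal sketch contrasting the two styles. It is not PostgreSQL source: make_offset, make_ptr, segment_create, ProcEntry and the two lock structs are hypothetical stand-ins, and a malloc'd buffer plays the role of the shared memory segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared memory segment attached at some base. */
static char  *segment_base;
static size_t segment_size;

void segment_create(size_t size)
{
    segment_base = malloc(size);
    assert(segment_base != NULL);
    segment_size = size;
}

/* In the offset style every reference has this one type. */
typedef size_t ShmOffset;

/* Convert a pointer inside the segment to its offset from the base. */
ShmOffset make_offset(const void *ptr)
{
    assert((const char *) ptr >= segment_base &&
           (const char *) ptr < segment_base + segment_size);
    return (ShmOffset) ((const char *) ptr - segment_base);
}

/* Convert an offset back to an untyped pointer into the segment. */
void *make_ptr(ShmOffset off)
{
    assert(off < segment_size);
    return segment_base + off;
}

typedef struct { int pid; } ProcEntry;

/* Offset style: the field's type says nothing about what it refers to. */
typedef struct { ShmOffset owner; } OffsetStyleLock;

/* Pointer style: the compiler knows owner refers to a ProcEntry. */
typedef struct { ProcEntry *owner; } PointerStyleLock;

int owner_pid_via_offset(const OffsetStyleLock *lock)
{
    /* Unchecked cast: casting to any other struct type would compile too. */
    ProcEntry *proc = (ProcEntry *) make_ptr(lock->owner);
    return proc->pid;
}

int owner_pid_via_pointer(const PointerStyleLock *lock)
{
    return lock->owner->pid;
}
```

In the offset version nothing stops a caller from casting the result of make_ptr to the wrong struct type, whereas in the pointer version the compiler would complain about storing an unrelated pointer type in owner. The trade-off is that plain pointers stored in the segment are only valid if every process maps it at the same address, which is the dependency raised earlier in this thread.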
Added to Win32 TODO: o Diagnose problem where shared memory can sometimes not be attached by postmaster children http://archives.postgresql.org/pgsql-general/2007-08/msg01377.php --------------------------------------------------------------------------- Magnus Hagander wrote: > Shelby Cain wrote: > >> ----- Original Message ---- From: Magnus Hagander > >> <magnus@hagander.net> To: Alvaro Herrera > >> <alvherre@commandprompt.com> Cc: Terry Yapt <yapt@technovell.com>; > >> pgsql-general@postgresql.org Sent: Thursday, August 23, 2007 > >> 3:43:32 PM Subject: Re: [GENERAL] FATAL: could not reattach to > >> shared memory (Win32) > >> > >> > >> 8.3 will have a new way to deal with shared mem on win32. It's the > >> same underlying tech, but we're no longer trying to squeeze it into > >> an emulation of sysv. With a bit of luck, that'll help :-) > >> > >> //Magnus > >> > > > > Wild guess on my part... could that error be the result of an attempt > > to map shared memory into a process at a fixed location that just > > happens to already be occupied by a dll that Windows had decided to > > relocate? > > Not that wild a guess, really :-) I'd say it's a very good possibility - > but I have no idea why it'd do that, since all backends load the same > DLLs at that stage. > > //Magnus -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com