Thread: Fwd: 8.0 Beta3 worked, RC1 didn't!
Forwarding the attached in case anyone missed it on -general.

The shmem attach address shown in his messages (00DC0000) seems mighty
low.  What I am suspecting is:
    1. Postmaster boots, creates shmem, and for some idiotic reason
       2003 Server creates the shmem segment just above the end of
       regular memory.
    2. When subprocesses launch and re-read GUC settings, for one
       reason or another they use up a little more RAM than the
       postmaster did.
    3. Subprocesses fail to attach to shmem because the target
       address is now in their regular RAM range.

I don't know why 2003 Server has such a brain-dead choice of shmem
address assignment, nor why listen_addresses might prompt a little extra
growth of RAM usage.  But the theory seems to fit the available facts.

If this is correct then we have to do something to force a smarter
choice of shmem address on Windows.  One brute-force way to do it
might be to malloc a couple hundred K just before the postmaster
attaches to shmem, and then release?

Theory B is that somehow UsedShmemSegAddr is not being passed down
accurately in this case, but that seems a mite improbable.

            regards, tom lane

------- Forwarded Message

Date: 23 Dec 2004 08:33:12 -0800
From: nico@def2shoot.com (Nicolas COUSSEMACQ)
To: pgsql-general@postgresql.org
Subject: [GENERAL] 8.0 Beta3 worked, RC1 didn't!

I have the same problem!

When I set up Postgres 8.0 Beta 4 on Windows XP or 2003 Server, it works
perfectly with the listen_addresses parameter set to '*' or localhost.
I have been testing Beta5, RC1 and RC2 on my XP workstation and there is no
problem, even if I accept external connections (listen_addresses = '*').
Then I tried to set up Beta5, RC1 or RC2 on a machine with 2003 Server, and
I can only access the database when listen_addresses = localhost. If I set
listen_addresses = '*', I get a connection problem in PgAdmin saying "Could
not receive server response to SSL negotiation packet: Connection reset by
peer (0x00002746/10054)". It happens when I launch PgAdmin logged on
directly at the machine, when I'm connected through remote access, and even
from my XP workstation.
The log file contains many lines such as these:
2004-12-23 16:55:17 FATAL: could not attach to proper memory at fixed address: shmget(key=5432001, addr=00DC0000) failed: Invalid argument
2004-12-23 16:55:17 FATAL: could not attach to proper memory at fixed address: shmget(key=5432001, addr=00DC0000) failed: Invalid argument
2004-12-23 16:55:17 LOG: background writer process (PID 680) exited with exit code 0
2004-12-23 16:55:17 LOG: terminating any other active server processes
2004-12-23 16:55:17 LOG: all server processes terminated; reinitializing

If I switch the listen_addresses parameter back to 'localhost', I can
connect to the DB in PgAdmin from the server screen or through remote
access.

Does this information help you?

"A. Mous" <a.mous@shaw.ca> wrote in message
news:000801c4e7d1$058c5300$6500a8c0@PETER...
> Hi all,
>
> I'm using psql 8.0.0 on a client's site who's running win server 2003.
> We've had him on beta 3 for some time, and no problems at all (yes, in a
> sense, he is a beta tester as well, but doesn't know it!).  Today I tried
> to upgrade the db to RC1 and had some problems.
>
> Remote clients connect to this database, so I have to set
> listen_addresses = '*' in the postgresql.conf file.  This is the only
> change to the config file.  Doing this with RC1 and trying to connect
> locally through psql resulted in the following error message:
>
> "could not receive server response to SSL negotiation packet; connection
> reset by peer (0x00002746/10054)"
>
> Removing the modified line in the config file resolved the problem
> (locally), however, no clients can connect!  Beta 3 does not seem to have
> this issue, so we had to revert back to it for now.
>
> I would appreciate any ideas that some of you may have.  Much thanks,
>
> -Peter

------- End of Forwarded Message
AFAIK Win32 does not care where in private process address space the
"shared memory" segment is. It can be mapped to different addresses in
different processes and still share the same physical address space.
This is why Win32 puts the private shared address anywhere in its own
address space, because it doesn't matter.

All that is needed is to create a *named* memory mapped segment of a
particular size and get the other processes to map the same name with the
same segment size, and it automagically works.

If you try to force it to any particular private process address you may
fail as you don't always know where program code (DLLs etc.) may be loaded.

Cheers,
Gary.
Gary Doades <gpd@gpdnet.co.uk> writes:
> AFAIK Win32 does not care where in private process address space the
> "shared memory" segment is. It can be mapped to different addresses in
> different processes and still share the same physical address space.
> This is why Win32 puts the private shared address anywhere in its own
> address space, because it doesn't matter.

Win32 may not care, but we do.  The shared memory segment must be mapped
at the same address in every backend.

> If you try to force it to any particular private process address you may
> fail as you don't always know where program code (DLLs etc.) may be loaded.

This is (or ought to be) irrelevant, because we are only talking about
instances of a single executable.

            regards, tom lane
Tom Lane wrote:
> Gary Doades <gpd@gpdnet.co.uk> writes:
>
>> AFAIK Win32 does not care where in private process address space the
>> "shared memory" segment is. It can be mapped to different addresses in
>> different processes and still share the same physical address space.
>> This is why Win32 puts the private shared address anywhere in its own
>> address space, because it doesn't matter.
>
> Win32 may not care, but we do.  The shared memory segment must be mapped
> at the same address in every backend.

Forgive me for not knowing the internals of postgres, but why? As long
as all the shared memory is accessed from the same relative offsets from
the private starting address it will refer to the same physical shared
memory address and should work.

Is this to maintain compatibility with the other platforms' way of doing
things, or is it down to the postgres internal architecture?

If this is the case then your suggestion may be the only one, to
artificially bump up the first free address and hope that it is enough.
Seems a bit hit and miss though (probably more hit than miss) since it's
not easily known what the extra allocation for the subsequent backends
may be.

>> If you try to force it to any particular private process address you may
>> fail as you don't always know where program code (DLLs etc.) may be loaded.
>
> This is (or ought to be) irrelevant, because we are only talking about
> instances of a single executable.

Agreed, as long as you can't have code dynamically linked from one
backend, but not another.

Cheers,
Gary.
Gary Doades <gpd@gpdnet.co.uk> writes:
> Tom Lane wrote:
>> Win32 may not care, but we do.  The shared memory segment must be mapped
>> at the same address in every backend.

> Forgive me for not knowing the internals of postgres, but why? As long
> as all the shared memory is accessed from the same relative offsets from
> the private starting address it will refer to the same physical shared
> memory address and should work.

Because we use absolute addresses in many cases.  There was once a
convention of making everything relative to ShmemBase, but we've
abandoned that for reasons of code simplicity (and to a lesser extent
performance).  There are still some places using relative offsets but
they are gradually going away.  We are not reversing that decision just
because some flavors of Windows have stupid algorithms for assigning
default shmem addresses.

> If this is the case then your suggestion may be the only one, to
> artificially bump up the first free address and hope that it is enough.
> Seems a bit hit and miss though (probably more hit than miss) since it's
> not easily known what the extra allocation for the subsequent backends
> may be.

The needed extra allocation should really be *zero*.  Keep in mind that
the intention of the EXEC_BACKEND code is to emulate the Unix case where
backends are spawned by fork().  Therefore the state of the backend at
the point where it needs to attach to shmem should really be hardly at
all different from the state of the postmaster.  I'm moderately
interested to find out why changing listen_addresses seems to affect
this, but on the strength of the available evidence I'd suspect it's a
matter of just a few bytes that happens to exceed an allocation boundary.

It might be that we could solve the problem by rethinking the order of
operations --- maybe we should reattach to shared memory during
restore_backend_variables, before the exec'd backend has had a chance
to do much of anything.

            regards, tom lane
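A minimal sketch of the absolute-address versus base-relative-offset point above, written in POSIX C rather than taken from PostgreSQL. It maps one temporary file twice inside a single process, which stands in for two backends that attached the same segment at different addresses. DemoSegment, map_two_views and the other names are hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define SEG_SIZE 4096

/* Hypothetical layout of a tiny shared segment (illustration only). */
typedef struct {
    int   *abs_ptr;   /* absolute address: only valid in the view that wrote it */
    size_t rel_off;   /* offset from the segment start: valid in every view */
    int    value;
} DemoSegment;

/*
 * Map one temp file twice, imitating two processes that attached the same
 * segment at different addresses.  Returns 0 on success, -1 on failure.
 */
int map_two_views(DemoSegment **a, DemoSegment **b)
{
    FILE *f = tmpfile();
    void *va, *vb;

    if (f == NULL)
        return -1;
    if (ftruncate(fileno(f), SEG_SIZE) != 0) {
        fclose(f);
        return -1;
    }
    va = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fileno(f), 0);
    vb = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fileno(f), 0);
    fclose(f);                    /* the mappings stay valid after the close */
    if (va == MAP_FAILED || vb == MAP_FAILED)
        return -1;
    *a = va;
    *b = vb;
    return 0;
}

/* Store both kinds of reference to `value` through one view. */
void publish(DemoSegment *seg)
{
    assert(seg != NULL);
    seg->value = 42;
    seg->abs_ptr = &seg->value;
    seg->rel_off = offsetof(DemoSegment, value);
}

/* An offset is resolved against whatever base this view happens to have. */
int read_via_offset(DemoSegment *seg)
{
    return *(int *) ((char *) seg + seg->rel_off);
}

/* Does the stored absolute pointer point into this particular view? */
int abs_ptr_is_local(DemoSegment *seg)
{
    return seg->abs_ptr == &seg->value;
}
```

After publish() is called through the first view, read_via_offset() works through either view, while abs_ptr_is_local() holds only for the view that wrote the pointer. That is why code that stores absolute addresses needs every process to attach the segment at the same address.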
Tom Lane wrote:
> The shmem attach address shown in his messages (00DC0000) seems mighty
> low.
>
> Theory B is that somehow UsedShmemSegAddr is not being passed down
> accurately in this case, but that seems a mite improbable.

I am confused.  I thought we used a hard-coded location for shared
memory on Win32.

I thought it was 0x00DB0000 something but I can't find any mention of
that.  Was it removed?  Are we now starting the postgres.exe binary and
assuming we can map to the same shared memory address as postmaster.exe?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I thought it was 0x00DB0000 something but I can't find any mention of
> that.  Was it removed?  Are we now starting the postgres.exe binary and
> assuming we can map to the same shared memory address as postmaster.exe?

Looks that way to me; and I think it considerably safer than using any
hard-wired address.

My current feeling is that the problem stems from waiting too long to
reattach to shared memory, and that we ought to do that as soon as we
can read the shmem address info from the temp file.

Just had a thought ... is it possible that this problem was introduced
by the recent changes to pass backend variables in shared memory instead
of in a temp file?  ISTM fairly possible that mapping that memory is
going to interfere with where we need to map the main shared memory
block.  I see that it gets unmapped after being read, but maybe the
damage is already done.

            regards, tom lane
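The "attach at the postmaster's address or fail" step discussed above can be sketched roughly as follows. This is a POSIX stand-in, not the Win32 port's code: create_segment and attach_segment are hypothetical names, and a temp file replaces the real segment. Without MAP_FIXED, mmap() treats the requested address only as a hint and will not displace an existing mapping, so the sketch compares the result against the wanted address itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create a temp-file backed "segment"; returns a descriptor or -1. */
int create_segment(size_t size)
{
    FILE *f = tmpfile();
    int   fd;

    if (f == NULL)
        return -1;
    fd = dup(fileno(f));          /* keep a descriptor after closing the FILE */
    fclose(f);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t) size) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Attach the segment.  If `wanted` is NULL the kernel picks the address
 * (the postmaster's case).  Otherwise the attach only counts as a success
 * when the mapping lands exactly at `wanted` (the child's case); any other
 * placement is undone and reported as a failure by returning NULL.
 */
void *attach_segment(int fd, size_t size, void *wanted)
{
    void *got = mmap(wanted, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    if (got == MAP_FAILED)
        return NULL;
    if (wanted != NULL && got != wanted) {
        munmap(got, size);        /* wrong place: treat as "could not attach" */
        return NULL;
    }
    return got;
}
```

If anything already occupies the wanted range in the child, whether heap growth or a second mapping as suspected above, attach_segment() returns NULL. The earlier the child attaches, the less it has had a chance to allocate first.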
I have tried to, and am unable to, reproduce this on any of my 2003
machines. I have tried with both RC1 and RC2.

For those who reported the problem:

1) To reproduce, I installed from the MSI installer and just changed the
listen_addresses parameter. Did you change anything *else* in your
configuration? In postgresql.conf or anywhere else in pg?

2) Does this happen in a freshly initdb:ed database, or only when there
is data? Does this happen directly after server (service) startup, or does
it require the database to be running for a while with
connections/disconnections before it happens?

3) Do you have any non-OS software installed on the machine(s) that are
showing this problem?

4) What's the value of shared_buffers in postgresql.conf?

Tom,
why is 00DC0000 so low? That's still 10Mb into the process, right?
Granted, it's not high, but it's not *that* low. (A simple test program
with all parameters at default gets its first address allocated at
003D2438 for me. A freshly MapViewOfFile()d memory ends up at 003f0000. If
I go for a larger test block (such as 50Mb), the mapped memory is moved up
to 004d0000. I get very similar results on XP and 2003.)

There are unfortunately several places in the shmem code that will return
EINVAL. So there is currently no way to detect exactly where the problem
is. What do you think of adding a couple of elog()s at each place to help
identify them?

//Magnus

>-----Original Message-----
>From: pgsql-hackers-win32-owner@postgresql.org
>[mailto:pgsql-hackers-win32-owner@postgresql.org] On Behalf Of Tom Lane
>Sent: 24 December 2004 16:01
>To: pgsql-hackers-win32@postgresql.org
>Subject: [pgsql-hackers-win32] Fwd: 8.0 Beta3 worked, RC1 didn't!
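A rough sketch of the elog() idea raised in the message above: when several code paths all end in EINVAL, giving each one its own message (or tag) tells you which path failed. The code is hypothetical and not from the shmem emulation; fprintf() stands in for elog(), and the three checks are invented stand-ins for the real failure sites.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical tags for the places that can fail with EINVAL. */
typedef enum {
    SITE_NONE = 0,
    SITE_SIZE_CHECK,
    SITE_MAP_VIEW,
    SITE_FIXED_ADDRESS
} FailSite;

static FailSite last_fail_site = SITE_NONE;

/* Record and log the failure site, then fail with EINVAL as before. */
static int fail_einval(FailSite site, const char *what)
{
    last_fail_site = site;
    fprintf(stderr, "shmem attach: EINVAL at %s\n", what);
    errno = EINVAL;
    return -1;
}

/*
 * Stand-in for an attach routine with several EINVAL exits.  `got` plays
 * the role of the address the OS actually returned for the mapping.
 */
int demo_attach(size_t size, const void *wanted, const void *got)
{
    last_fail_site = SITE_NONE;
    if (size == 0)
        return fail_einval(SITE_SIZE_CHECK, "size check");
    if (got == NULL)
        return fail_einval(SITE_MAP_VIEW, "mapping the view");
    if (wanted != NULL && got != wanted)
        return fail_einval(SITE_FIXED_ADDRESS, "fixed-address comparison");
    return 0;
}

/* Which site produced the most recent failure, if any. */
FailSite demo_last_fail_site(void)
{
    return last_fail_site;
}
```

The caller still sees the same errno in every case, so existing error handling is unaffected. Only the log line differs, which is the point of the exercise.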
1) I checked the option in the setup program that allows connections from
all client workstations, and added one line in pg_hba.conf ('host all all
10.0.0.0/8 password'). When I set up postgres without checking this option,
it runs perfectly from localhost, but when I activate 'external
connections', it fails...

2) I tried to set up both with and without data from a previously installed
postgres. I think that the problem is immediate, because I get a message
during the installation explaining that the setup program cannot contact
the database server (I think that it happens when installing PL/pgSQL...).

3) I tried to set up Postgres Beta5, RC1 and RC2 on two servers: one was
clean, it had just been running Beta4 for a few days, and the other was
hosting my old MySQL database. I got the same problem in all cases.

4) shared_buffers = 1000

----- Original Message -----
From: "Magnus Hagander" <mha@sollentuna.net>
To: "Tom Lane" <tgl@sss.pgh.pa.us>; <pgsql-hackers-win32@postgresql.org>
Cc: <nico@def2shoot.com>
Sent: Monday, December 27, 2004 7:53 PM
Subject: RE: [pgsql-hackers-win32] Fwd: 8.0 Beta3 worked, RC1 didn't!
"Magnus Hagander" <mha@sollentuna.net> writes:
> Tom,
> why is 00DC0000 so low? That's still 10Mb into the process, right?
> Granted, it's not high, but it's not *that* low. (A simple test program
> with all parameters at default gets its first address allocated at
> 003D2438 for me. A freshly MapViewOfFile()d memory ends up at 003f0000.
> If I go for a larger test block (such as 50Mb), the mapped memory is
> moved up to 004d0000. I get very similar results on XP and 2003.)

The question is not whether it's "low", it's whether there's any
daylight between the end of memory in a postmaster/backend image and
where the shmem segment gets placed.

On Unix, shmat() is supposed to leave a lot of room between the data
break address and where it puts shmem, so that malloc still has room to
play in.  I suspect that Windows is willing to malloc() memory above the
shmem segment and so thinks that it doesn't need to leave any daylight
there, other than rounding off to a page boundary for hardware reasons.
If the backend process malloc's a bit more space than the postmaster did
before trying to attach, we got trouble.

It's not clear to me exactly *why* the backend would allocate any more
space than the postmaster did, but that's my working hypothesis at the
moment.

            regards, tom lane
Tom Lane wrote:
> If the backend process malloc's a bit more space than the postmaster did
> before trying to attach, we got trouble.
>
> It's not clear to me exactly *why* the backend would allocate any more
> space than the postmaster did, but that's my working hypothesis at the
> moment.

What if we malloc 100k just before we create the postmaster segment and
then free it and see if that fixes the postgres.exe problem?

-- 
  Bruce Momjian
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> What if we malloc 100k just before we create the postmaster segment and
> then free it and see if that fixes the postgres.exe problem?

That was suggested already. As a permanent fix it's certainly
unspeakably ugly, but it would be useful to try it just to prove
(or disprove) that we understand the problem.

It would probably be a good idea to make the padding at least 256K,
since the numbers that have been tossed around seem to indicate that
Windows may be aligning things on 128K boundaries.

My inclination for a permanent fix would be to try to do the shmat()
much earlier, but I don't think we should go to the effort of doing
that code rearrangement until we've proven that this is indeed the
issue.

			regards, tom lane
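[Editor's sketch] A minimal illustration of the padding experiment discussed above, not the patch that was actually tested. The names are hypothetical, and it assumes the padding is held while the segment is created and released afterwards, which is one reading of "malloc ... and then free it". Whether a given C runtime places such a block at the top of the heap is also an assumption. The rounding helper shows why padding of at least one full alignment unit is needed to move the segment at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SHMEM_PAD_BYTES (256 * 1024)   /* "at least 256K", per the thread */

/* Round addr up to the next multiple of granularity (a power of two). */
uintptr_t
round_up_to(uintptr_t addr, uintptr_t granularity)
{
    assert(granularity != 0 && (granularity & (granularity - 1)) == 0);
    return (addr + granularity - 1) & ~(granularity - 1);
}

/*
 * Hypothetical: grab a block of heap before the segment is created so the
 * OS has to place the segment above it. Returns NULL on failure.
 */
void *
shmem_pad_acquire(size_t pad_bytes)
{
    void *pad = malloc(pad_bytes);

    if (pad != NULL)
        memset(pad, 0, pad_bytes);     /* touch the pages, not just reserve */
    return pad;
}

/* Hypothetical: release the padding once the segment exists. */
void
shmem_pad_release(void *pad)
{
    free(pad);
}
```

Intended use would be to call shmem_pad_acquire(SHMEM_PAD_BYTES) in the postmaster, create the segment, then call shmem_pad_release(), leaving roughly that much room for child processes to grow into before they attach.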
Tom Lane wrote:
> [...]
> My inclination for a permanent fix would be to try to do the shmat()
> much earlier, but I don't think we should go to the effort of doing
> that code rearrangement until we've proven that this is indeed the
> issue.

Right. Merlin, I added you to this email. Can you test that? Do you
need us to send you a patch for testing?

-- 
  Bruce Momjian
>Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> What if we malloc 100k just before we create the postmaster segment and
>> then free it and see if that fixes the postgres.exe problem?
>
>[...]
>
>It would probably be a good idea to make the padding at least 256K,
>since the numbers that have been tossed around seem to indicate that
>Windows may be aligning things on 128K boundaries.

Still unable to reproduce this, even with the more detailed steps in
Nicolas's mail. However, I've created a postgres.exe based on
cvs-as-of-yesterday plus the attached patch for testing.

The file is available at
http://www.hagander.net/pgsql/postgres_shmem.zip

Nicolas and Merlin - can you test with this .exe, please? You need to
replace *both* postmaster.exe *and* postgres.exe with the new one.

//Magnus
> Nicolas and Merlin - can you test with this .exe, please? You
> need to replace *both* postmaster.exe *and* postgres.exe with
> the new one.

I've now had confirmation from one person (Edgars) that this solves his
problem. I'd like confirmation from at least one more, but things point
towards this being the reason.

Tom - what's next? Do we want to roll RC3 with this ugly fix, or do we
want to look at a better fix right away?

One thought - what if we hard-code the address to somewhere at the 1GB
limit? That would limit us to 1GB of shared buffers (or 2GB if started
with the /3GB switch to give user programs 3GB in Windows), but I don't
see *anybody* needing 1GB shared buffers... Or is that a bad idea?

//Magnus
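[Editor's sketch] The size limits quoted for a hard-coded base address follow from subtracting the base from the top of the user address space. The snippet below only restates that arithmetic. The function name is hypothetical, and the 2GB and 3GB figures are the usual user-space sizes of 32-bit Windows without and with the /3GB switch.

```c
#include <assert.h>
#include <stdint.h>

#define GIB ((uint64_t) 1 << 30)

/*
 * Hypothetical helper: the largest segment that fits between a fixed base
 * address and the top of the user-mode address space.
 */
uint64_t
max_segment_bytes(uint64_t user_space_top, uint64_t fixed_base)
{
    assert(fixed_base <= user_space_top);
    return user_space_top - fixed_base;
}
```

A base at the 1GB mark leaves max_segment_bytes(2 * GIB, 1 * GIB), i.e. 1GB, by default, and 2GB when user space extends to 3GB.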
It works for me too.

----- Original Message -----
From: "Magnus Hagander" <mha@sollentuna.net>
To: "Tom Lane" <tgl@sss.pgh.pa.us>; "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: <pgsql-hackers-win32@postgresql.org>; <nico@def2shoot.com>; "Merlin Moncure" <merlin.moncure@rcsonline.com>; "Edgars Diebelis" <edgars.diebelis@divi.lv>
Sent: Wednesday, December 29, 2004 10:42 AM
Subject: RE: [pgsql-hackers-win32] Fwd: 8.0 Beta3 worked, RC1 didn't!

[...]
I've now had confirmation from one person (Edgars) that this solves his
problem. I'd like confirmation from at least one more, but things point
towards this being the reason.
[...]
> I've now had confirmation from one person (Edgars) that this solves his
> problem. I'd like confirmation from at least one more, but things point
> towards this being the reason.
> [...]
> //Magnus

I can confirm the patched version fixes my busted win2k box. I was
unable to get Magnus's compiled binary to work, maybe because I'm using
gcc 3.4.1.

Merlin
"Magnus Hagander" <mha@sollentuna.net> writes: > Tom - what's next? Do we want to roll RC3 with this ugly fix, or do we > want to look at a better fix right away? I think we want to look at a better fix right away; mainly because we need to test it to be sure that it really fixes the problem ;-). I will work on this today. regards, tom lane
> "Magnus Hagander" <mha@sollentuna.net> writes: >> Tom - what's next? Do we want to roll RC3 with this ugly fix, or do we >> want to look at a better fix right away? > I think we want to look at a better fix right away; mainly because we > need to test it to be sure that it really fixes the problem ;-). > I will work on this today. I have committed fixes that rearrange the code as I was envisioning. Things still seem to work when building with -DEXEC_BACKEND on Unix, but I'm not in a position to verify the Windows-specific code. Please give it a try ASAP. regards, tom lane
>>> Tom - what's next? Do we want to roll RC3 with this ugly fix, or do we
>>> want to look at a better fix right away?
>
>> I think we want to look at a better fix right away; mainly because we
>> need to test it to be sure that it really fixes the problem ;-).
>> I will work on this today.
>
>I have committed fixes that rearrange the code as I was envisioning.
>Things still seem to work when building with -DEXEC_BACKEND on Unix,
>but I'm not in a position to verify the Windows-specific code. Please
>give it a try ASAP.

It passes all regression tests for me. But since I never saw the
original problem, I can't confirm whether it solves it.

I have put up a new binary built from today's cvs at
http://www.hagander.net/pgsql/postgres_shmem2.zip. For those of you who
have the problem: as before, copy this file over both postgres.exe and
postmaster.exe, then test and let us know whether this one also fixes
things.

//Magnus