Thread: 9.1Beta1 - Repeatable Crash on Windows
psql -U postgres
psql (9.1beta1)
WARNING: Console code page (437) differs from Windows code page (1252)
         8-bit characters might not work correctly. See psql reference
         page "Notes for Windows users" for details.
Type "help" for help.

postgres=# SELECT 'INFINITY'::TIMESTAMP;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

I've also included this report on the google doc.

--
Regards,
Richard Broersma Jr.
On 5/9/11 2:23 PM, Richard Broersma wrote:
> psql -U postgres
> psql (9.1beta1)
> WARNING: Console code page (437) differs from Windows code page (1252)
>          8-bit characters might not work correctly. See psql reference
>          page "Notes for Windows users" for details.
> Type "help" for help.
>
> postgres=# SELECT 'INFINITY'::TIMESTAMP;
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
> I've also included this report on the google doc.

Thanks. Is the PostgreSQL server running on Windows here?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Yes.

On Mon, May 9, 2011 at 2:34 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Thanks. Is the PostgreSQL server running on Windows here?

--
Regards,
Richard Broersma Jr.
On 5/9/11 2:47 PM, Richard Broersma wrote:
> Yes.

OK, let me see if I can make this happen on Linux or OSX. Have you tried?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On Mon, May 9, 2011 at 2:48 PM, Josh Berkus <josh@agliodbs.com> wrote:
> OK, let me see if I can make this happen on Linux or OSX. Have you tried?

No, I don't have any computers with these OSes available to me.

--
Regards,
Richard Broersma Jr.
On 05/09/2011 05:48 PM, Josh Berkus wrote:
> On 5/9/11 2:47 PM, Richard Broersma wrote:
>> Yes.
>
> OK, let me see if I can make this happen on Linux or OSX. Have you tried?

Works fine here on Linux:

psql (9.1beta1)
Type "help" for help.

gsmith=# SELECT 'INFINITY'::TIMESTAMP;
 timestamp
-----------
 infinity
(1 row)

Richard, anything interesting in the server log after your crash?

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
On Mon, May 9, 2011 at 3:53 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Richard, anything interesting in the server log after your crash?

Here is what the logs show:

2011-05-09 07:49:37 PDT LOG: server process (PID 2848) was terminated by exception 0xC0000005
2011-05-09 07:49:37 PDT HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
2011-05-09 07:49:37 PDT LOG: terminating any other active server processes
2011-05-09 07:49:38 PDT WARNING: terminating connection because of crash of another server process
2011-05-09 07:49:38 PDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2011-05-09 07:49:38 PDT HINT: In a moment you should be able to reconnect to the database and repeat your command.
2011-05-09 07:49:38 PDT FATAL: the database system is in recovery mode
2011-05-09 07:49:38 PDT LOG: all server processes terminated; reinitializing
2011-05-09 07:49:48 PDT FATAL: pre-existing shared memory block is still in use
2011-05-09 07:49:48 PDT HINT: Check if there are any old server processes still running, and terminate them.

--
Regards,
Richard Broersma Jr.
Richard Broersma wrote:
> Here is what the logs show:
>
> 2011-05-09 07:49:37 PDT LOG: server process (PID 2848) was terminated
> by exception 0xC0000005

Too bad, that's just a generic "accessed memory you shouldn't have" exception. Not much help narrowing down the source. That could be a driver or hardware issue, but since you say it's repeatable that seems less likely.

At this point, there's a fork in the road. If someone else can reproduce this on another Windows system, they may be able to run with it. But if you can spare some time to dig further, the instructions at http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows go over how to trace where it's actually failing yourself. If you run PostgreSQL on Windows, that's good defensive practice to fit in on a day when it's not an emergency. (The same is true of any platform; it just takes more time to set up on Windows.)

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
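For readers decoding the log hint about "ntstatus.h": 0xC0000005 is the NTSTATUS value Windows names STATUS_ACCESS_VIOLATION. The following is only a minimal C sketch of such a lookup. The function name `ntstatus_name` and the handful of codes chosen are illustrative assumptions, not anything PostgreSQL or the Windows SDK provides in this form.

```c
#include <stdint.h>

/*
 * Hypothetical helper: map a few well-known NTSTATUS exception codes to
 * the names they carry in ntstatus.h. Anything else falls through to a
 * pointer back at the header itself.
 */
const char *
ntstatus_name(uint32_t code)
{
    switch (code)
    {
        case 0xC0000005u: return "STATUS_ACCESS_VIOLATION";
        case 0xC000001Du: return "STATUS_ILLEGAL_INSTRUCTION";
        case 0xC0000094u: return "STATUS_INTEGER_DIVIDE_BY_ZERO";
        case 0xC00000FDu: return "STATUS_STACK_OVERFLOW";
        default:          return "(not in this table; see ntstatus.h)";
    }
}
```

Calling `ntstatus_name(0xC0000005u)` returns the access-violation name, which matches the "accessed memory you shouldn't have" reading given above.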
-----Original Message-----
From: pgsql-testers-owner@postgresql.org [mailto:pgsql-testers-owner@postgresql.org] On Behalf Of Greg Smith
Sent: May 9, 2011 20:55
To: Richard Broersma
Cc: pgsql-testers@postgresql.org
Subject: Re: [TESTERS] 9.1Beta1 - Repeatable Crash on Windows

> [...] If someone else can reproduce this on another Windows system, they
> may be able to run with it. But if you can spare some time to dig further,
> the instructions at
> http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows
> go over how to trace into where it's actually failing at yourself. [...]

------------------------

I can reproduce the same exception in another manner. I tried to get a stack trace but cannot seem to attach to the process using either Process Explorer or WinDBG.

I reproduced this using the current beta pgAdmin by right-clicking on Login Roles, selecting New login role, entering role name test and password testp, selecting all role privileges, then clicking OK. The service crashes. PID 5432 below is the pgAdmin process.

My log file (originally in French, translated here):

2011-05-10 08:00:26 EDT LOG: server process (PID 5432) was terminated by exception 0xC0000005
2011-05-10 08:00:26 EDT HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
2011-05-10 08:00:26 EDT LOG: terminating any other active server processes
2011-05-10 08:00:26 EDT WARNING: terminating connection because of crash of another server process
2011-05-10 08:00:26 EDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2011-05-10 08:00:26 EDT HINT: In a moment you should be able to reconnect to the database and repeat your command.
2011-05-10 08:00:26 EDT WARNING: terminating connection because of crash of another server process
2011-05-10 08:00:26 EDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2011-05-10 08:00:26 EDT HINT: In a moment you should be able to reconnect to the database and repeat your command.
2011-05-10 08:00:26 EDT LOG: all server processes terminated; reinitializing
2011-05-10 08:00:36 EDT FATAL: pre-existing shared memory block is still in use
2011-05-10 08:00:36 EDT HINT: Check if there are any old server processes still running, and terminate them.

- Mark Watson
On Mon, May 9, 2011 at 5:55 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> But if you can
> spare some time to dig further, the instructions at
> http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows
> go over how to trace into where it's actually failing at yourself. If you
> run PostgreSQL on Windows, that's good defensive practice to fit in on a day
> it's not an emergency to do so. (The same is true of any platform, it just
> takes more time to setup on Windows)

The following is the stack trace taken using the instructions for "Getting a stack trace of a repeatable backend crash."

0:003> G
(4ac.158): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=ffffffff ebx=fffffff4 ecx=ffffffff edx=fffffffd esi=00000000 edi=00000000
eip=0063b890 esp=00d4f614 ebp=00d4f7c0 iopl=0 nv up ei ng nz na pe cy
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010287
postgres!datebsearch+0x30:
0063b890 0fbe0c97 movsx ecx,byte ptr [edi+edx*4] ds:0023:fffffff4=??
0:000> ~*k

.  0  Id: 4ac.158 Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr
00d4f620 0063d738 postgres!datebsearch+0x30
    [c:\pginstaller-repo\postgres.windows\src\backend\utils\adt\datetime.c @ 3579]
00d4f634 0063e824 postgres!DecodeSpecial+0x38
    [c:\pginstaller-repo\postgres.windows\src\backend\utils\adt\datetime.c @ 2789]
00d4f688 006a62a9 postgres!DecodeDateTime+0x654
    [c:\pginstaller-repo\postgres.windows\src\backend\utils\adt\datetime.c @ 1173]
00d4f818 006df4ce postgres!timestamp_in+0x79
    [c:\pginstaller-repo\postgres.windows\src\backend\utils\adt\timestamp.c @ 161]
00d4fa3c 006e0fc3 postgres!InputFunctionCall+0xae
    [c:\pginstaller-repo\postgres.windows\src\backend\utils\fmgr\fmgr.c @ 1909]
00d4fa7c 005b66f7 postgres!OidInputFunctionCall+0x33
    [c:\pginstaller-repo\postgres.windows\src\backend\utils\fmgr\fmgr.c @ 2041]
00d4fa98 005a4d3d postgres!stringTypeDatum+0x27
    [c:\pginstaller-repo\postgres.windows\src\backend\parser\parse_type.c @ 586]
00d4fad4 005a437e postgres!coerce_type+0x16d
    [c:\pginstaller-repo\postgres.windows\src\backend\parser\parse_coerce.c @ 253]
00d4fb08 005a915a postgres!coerce_to_target_type+0x4e
    [c:\pginstaller-repo\postgres.windows\src\backend\parser\parse_coerce.c @ 90]
00d4fb40 005aaa96 postgres!transformTypeCast+0x6a
    [c:\pginstaller-repo\postgres.windows\src\backend\parser\parse_expr.c @ 2105]
00d4fb58 005b5abc postgres!transformExpr+0x326
    [c:\pginstaller-repo\postgres.windows\src\backend\parser\parse_expr.c @ 187]
00d4fb7c 00585ac4 postgres!transformTargetList+0xdc
    [c:\pginstaller-repo\postgres.windows\src\backend\parser\parse_target.c @ 166]
00d4fb9c 00586187 postgres!transformSelectStmt+0x74
    [c:\pginstaller-repo\postgres.windows\src\backend\parser\analyze.c @ 903]
00d4fbb0 00586317 postgres!transformStmt+0xa7
    [c:\pginstaller-repo\postgres.windows\src\backend\parser\analyze.c @ 187]
00d4fbc4 0060fc4e postgres!parse_analyze+0x37
    [c:\pginstaller-repo\postgres.windows\src\backend\parser\analyze.c @ 97]
00d4fc70 006105e5 postgres!exec_simple_query+0x28e
    [c:\pginstaller-repo\postgres.windows\src\backend\tcop\postgres.c @ 945]
00d4fcf4 005ca54c postgres!PostgresMain+0x575
    [c:\pginstaller-repo\postgres.windows\src\backend\tcop\postgres.c @ 3926]
00d4fd14 005cd27a postgres!BackendRun+0x19c
    [c:\pginstaller-repo\postgres.windows\src\backend\postmaster\postmaster.c @ 3600]
00d4ff64 00527b1b postgres!SubPostmasterMain+0x30a
    [c:\pginstaller-repo\postgres.windows\src\backend\postmaster\postmaster.c @ 4096]
00d4ff7c 0071270d postgres!main+0x1fb
    [c:\pginstaller-repo\postgres.windows\src\backend\main\main.c @ 176]
00d4ffc0 7c817077 postgres!__tmainCRTStartup+0x10f
    [f:\dd\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 586]
00d4fff0 00000000 kernel32!BaseProcessStart+0x23

   1  Id: 4ac.294 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr
018bfecc 7c90d3aa ntdll!KiFastSystemCallRet
018bfed0 7c8314ae ntdll!ZwFsControlFile+0xc
018bff14 005bd337 kernel32!ConnectNamedPipe+0x52
018bffb4 7c80b729 postgres!pg_signal_thread+0x97
    [c:\pginstaller-repo\postgres.windows\src\backend\port\win32\signal.c @ 275]
018bffec 00000000 kernel32!BaseThreadStart+0x37

   2  Id: 4ac.528 Suspend: 1 Teb: 7ffdd000 Unfrozen
ChildEBP RetAddr
0473ff2c 7c90df5a ntdll!KiFastSystemCallRet
0473ff30 7c8025db ntdll!NtWaitForSingleObject+0xc
0473ff94 005be98b kernel32!WaitForSingleObjectEx+0xa8
0473ffb4 7c80b729 postgres!pg_timer_thread+0x2b
    [c:\pginstaller-repo\postgres.windows\src\backend\port\win32\timer.c @ 51]
0473ffec 00000000 kernel32!BaseThreadStart+0x37

--
Regards,
Richard Broersma Jr.
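A note on reading the faulting instruction in the trace above: `movsx ecx,byte ptr [edi+edx*4]` loads one byte from the address edi + edx*4, computed with 32-bit wraparound. Plugging in the dumped registers (edi=00000000, edx=fffffffd) reproduces the address WinDbg reports, fffffff4. A small C sketch of that arithmetic follows; the function name is made up for illustration.

```c
#include <stdint.h>

/*
 * Effective address of an x86 operand of the form [base + index*scale],
 * as computed on a 32-bit machine: unsigned arithmetic modulo 2^32.
 */
uint32_t
effective_address(uint32_t base, uint32_t index, uint32_t scale)
{
    return base + index * scale;
}
```

`effective_address(0x00000000u, 0xfffffffdu, 4u)` gives 0xfffffff4, which is 12 bytes below address zero. That is consistent with, though it does not by itself prove, a null base pointer combined with a negative index.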
On 05/10/2011 01:07 PM, Richard Broersma wrote:
> 0:003> G
> (4ac.158): Access violation - code c0000005 (first chance)
> First chance exceptions are reported before any exception handling.
> This exception may be expected and handled.
> eax=ffffffff ebx=fffffff4 ecx=ffffffff edx=fffffffd esi=00000000 edi=00000000
> eip=0063b890 esp=00d4f614 ebp=00d4f7c0 iopl=0 nv up ei ng nz na pe cy
> cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010287
> postgres!datebsearch+0x30:
> 0063b890 0fbe0c97 movsx ecx,byte ptr [edi+edx*4] ds:0023:fffffff4=??
> 0:000> ~*k
>
> .  0  Id: 4ac.158 Suspend: 1 Teb: 7ffdf000 Unfrozen
> ChildEBP RetAddr
> 00d4f620 0063d738 postgres!datebsearch+0x30
> [c:\pginstaller-repo\postgres.windows\src\backend\utils\adt\datetime.c
> @ 3579]

That's interesting...it's going crazy here:

3565 /* datebsearch()
3566  * Binary search -- from Knuth (6.2.1) Algorithm B. Special case like this
3567  * is WAY faster than the generic bsearch().
3568  */
3569 static const datetkn *
3570 datebsearch(const char *key, const datetkn *base, int nel)
3571 {
3572     const datetkn *last = base + nel - 1,
3573                *position;
3574     int         result;
3575
3576     while (last >= base)
3577     {
3578         position = base + ((last - base) >> 1);
3579         result = key[0] - position->token[0];

So something about that is getting really confused when searching back to negative infinity, but seemingly only on Windows.

Thanks for the great detective work; I'll bounce this over to pgsql-hackers where more people will see it.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
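One plausible reading of the register dump, not confirmed anywhere in this thread, is that datebsearch() was called with base == NULL and nel == 0. In that case `last = base + nel - 1` wraps to a very high address, `last >= base` still holds, and the first dereference of `position` faults. The sketch below shows a binary search that refuses an empty or missing table before doing any pointer arithmetic. It assumes a simplified `datetkn` layout: the struct, the `TOKMAXLEN` value, and the name `datebsearch_guarded` are illustrative, not PostgreSQL's definitions, and this is not the fix that was applied upstream.

```c
#include <stddef.h>
#include <string.h>

#define TOKMAXLEN 10            /* assumed token width for this sketch */

typedef struct
{
    char token[TOKMAXLEN + 1];  /* NUL-terminated here, for simplicity */
    int  value;
} datetkn;

/*
 * Binary search over a table sorted by token. Returns NULL when the key
 * is absent, and also when the table is missing or empty, so no pointer
 * before the start of the array is ever formed.
 */
const datetkn *
datebsearch_guarded(const char *key, const datetkn *base, int nel)
{
    int lo;
    int hi;

    if (key == NULL || base == NULL || nel <= 0)
        return NULL;

    lo = 0;
    hi = nel - 1;
    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;
        int result = strcmp(key, base[mid].token);

        if (result == 0)
            return &base[mid];
        if (result < 0)
            hi = mid - 1;
        else
            lo = mid + 1;
    }
    return NULL;
}
```

Using integer indexes instead of a `last` pointer is a deliberate choice in this sketch: an index of -1 is an ordinary value, whereas a pointer one element before the array is not something C allows the code to rely on.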
On Tue, May 10, 2011 at 2:41 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> So something about that is getting really confused when searching back to
> negative infinity, but seemingly only on Windows.
>
> Thanks for the great detective work help, I'll bounce this over to
> pgsql-hackers where more people will see it.

No problem. Thanks for the directions on how to help.

--
Regards,
Richard Broersma Jr.
Inquisition launched at http://archives.postgresql.org/message-id/4DC9B5F2.7030804@2ndQuadrant.com if anyone who isn't subscribed to that heavy mailing list wants to see what happens.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
I'm going on vacation, so I'll be able to respond in about two weeks.

On Tue, May 10, 2011 at 4:09 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Inquisition launched at
> http://archives.postgresql.org/message-id/4DC9B5F2.7030804@2ndQuadrant.com
> if anyone who isn't subscribed to that heavy mailing list wants to see what
> happens.

--
Regards,
Richard Broersma Jr.
Can either/both of you seeing this crash do the following:

show timezone;
select * from pg_timezone_abbrevs;

And post the result? Tom Lane mentioned there's a similar problem already being chased around, and that data will help figure out how your case fits in relation to it.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
> -----Original Message-----
> From: pgsql-testers-owner@postgresql.org [mailto:pgsql-testers-owner@postgresql.org] On Behalf Of Greg Smith
> Sent: May 11, 2011 00:28
> To: Greg Smith
> Cc: Richard Broersma; pgsql-testers@postgresql.org
> Subject: Re: [TESTERS] 9.1Beta1 - Repeatable Crash on Windows
>
> Can either/both of you seeing this crash do the following:
>
> show timezone;
> select * from pg_timezone_abbrevs;
>
> And post the result? Tom Lane mentioned there's a similar problem
> already being chased around, and that data will help figure out how your
> case fits in relation to it.

"show timezone" gives US/Eastern.
"select * from pg_timezone_abbrevs" returns zero rows.

--
Mark Watson
On Tue, May 10, 2011 at 9:28 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Can either/both of you seeing this crash do the following:
>
> show timezone;
> select * from pg_timezone_abbrevs;
>
> And post the result?

Here are the results:

psql (9.1beta1)
WARNING: Console code page (437) differs from Windows code page (1252)
         8-bit characters might not work correctly. See psql reference
         page "Notes for Windows users" for details.
Type "help" for help.

postgres=# show timezone;
  TimeZone
------------
 US/Pacific
(1 row)

postgres=# select * from pg_timezone_abbrevs;
 abbrev | utc_offset | is_dst
--------+------------+--------
(0 rows)

--
Regards,
Richard Broersma Jr.
The initial resolution on all this is that it's another instance of a really tricky bug that should be resolved now. Long description of the problem is at http://archives.postgresql.org/message-id/17311.1305080416@sss.pgh.pa.us

Thanks for the report, and if you could check this again on 9.1 Beta 2 after it's released to confirm the fix holds, that would be helpful.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
-----Original Message-----
> From: pgsql-testers-owner@postgresql.org [mailto:pgsql-testers-
> The initial resolution on all this is that it's another instance of a
> really tricky bug that should be resolved now. Long description of the
> problem is at
> http://archives.postgresql.org/message-id/17311.1305080416@sss.pgh.pa.us
>
> Thanks for the report, and if you could check this again on 9.1 Beta 2
> after it's released to confirm the fix holds, that would be helpful.
>
> --
> Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
> PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

FYI, this has been resolved with 9.1 Beta 2 on our win32 systems.

- Mark Watson