Thread: 9.1Beta1 - Repeatable Crash on Windows

From:
Richard Broersma
Date:

psql -U postgres
psql (9.1beta1)
WARNING: Console code page (437) differs from Windows code page (1252)
         8-bit characters might not work correctly. See psql reference
         page "Notes for Windows users" for details.
Type "help" for help.

postgres=# SELECT 'INFINITY'::TIMESTAMP;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>


I've also included this report on the google doc.

--
Regards,
Richard Broersma Jr.

From:
Josh Berkus
Date:

On 5/9/11 2:23 PM, Richard Broersma wrote:
> psql -U postgres
> psql (9.1beta1)
> WARNING: Console code page (437) differs from Windows code page (1252)
>          8-bit characters might not work correctly. See psql reference
>          page "Notes for Windows users" for details.
> Type "help" for help.
>
> postgres=# SELECT 'INFINITY'::TIMESTAMP;
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
>
> I've also included this report on the google doc.

Thanks.  Is the PostgreSQL server running on Windows here?


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

From:
Richard Broersma
Date:

Yes.

On Mon, May 9, 2011 at 2:34 PM, Josh Berkus <> wrote:
> On 5/9/11 2:23 PM, Richard Broersma wrote:
>> psql -U postgres
>> psql (9.1beta1)
>> WARNING: Console code page (437) differs from Windows code page (1252)
>>          8-bit characters might not work correctly. See psql reference
>>          page "Notes for Windows users" for details.
>> Type "help" for help.
>>
>> postgres=# SELECT 'INFINITY'::TIMESTAMP;
>> server closed the connection unexpectedly
>>         This probably means the server terminated abnormally
>>         before or while processing the request.
>> The connection to the server was lost. Attempting reset: Failed.
>> !>
>>
>>
>> I've also included this report on the google doc.
>
> Thanks.  Is the PostgreSQL server running on Windows here?
>
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
> -
> HOWTO Alpha/Beta Test:
> http://wiki.postgresql.org/wiki/HowToBetaTest
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-testers
>



--
Regards,
Richard Broersma Jr.

From:
Josh Berkus
Date:

On 5/9/11 2:47 PM, Richard Broersma wrote:
> Yes.

OK, let me see if I can make this happen on Linux or OSX.  Have you tried?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

From:
Richard Broersma
Date:

On Mon, May 9, 2011 at 2:48 PM, Josh Berkus <> wrote:
> On 5/9/11 2:47 PM, Richard Broersma wrote:

> OK, let me see if I can make this happen on Linux or OSX.  Have you tried?

No, I don't have any computers with these OSes available to me.


--
Regards,
Richard Broersma Jr.

From:
Greg Smith
Date:

On 05/09/2011 05:48 PM, Josh Berkus wrote:
> On 5/9/11 2:47 PM, Richard Broersma wrote:
>
>> Yes.
>>
> OK, let me see if I can make this happen on Linux or OSX.  Have you tried?
>

Works fine here on Linux:

psql (9.1beta1)
Type "help" for help.

gsmith=# SELECT 'INFINITY'::TIMESTAMP;
  timestamp
-----------
  infinity
(1 row)


Richard, anything interesting in the server log after your crash?

--
Greg Smith   2ndQuadrant US       Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



From:
Richard Broersma
Date:

On Mon, May 9, 2011 at 3:53 PM, Greg Smith <> wrote:
> Richard, anything interesting in the server log after your crash?

Here is what the logs show:

2011-05-09 07:49:37 PDT LOG:  server process (PID 2848) was terminated
by exception 0xC0000005
2011-05-09 07:49:37 PDT HINT:  See C include file "ntstatus.h" for a
description of the hexadecimal value.
2011-05-09 07:49:37 PDT LOG:  terminating any other active server processes
2011-05-09 07:49:38 PDT WARNING:  terminating connection because of
crash of another server process
2011-05-09 07:49:38 PDT DETAIL:  The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2011-05-09 07:49:38 PDT HINT:  In a moment you should be able to
reconnect to the database and repeat your command.
2011-05-09 07:49:38 PDT FATAL:  the database system is in recovery mode
2011-05-09 07:49:38 PDT LOG:  all server processes terminated; reinitializing
2011-05-09 07:49:48 PDT FATAL:  pre-existing shared memory block is still in use
2011-05-09 07:49:48 PDT HINT:  Check if there are any old server
processes still running, and terminate them.
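
For anyone scanning server logs for this kind of crash, the PID and exception code can be pulled out of the first LOG line mechanically. The following is only a sketch: parse_crash_line is a hypothetical helper, not part of PostgreSQL, and it assumes the English wording of the message shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the PID and exception code from a
 * postmaster crash line. Returns 1 on success, 0 if the line does
 * not contain the expected English message. */
static int parse_crash_line(const char *line, int *pid, unsigned int *code)
{
    const char *p = strstr(line, "server process (PID ");

    if (p == NULL)
        return 0;
    /* A space in the format matches any run of whitespace, so the
     * mail-wrapped newline before "by exception" is accepted too. */
    return sscanf(p, "server process (PID %d) was terminated by exception 0x%x",
                  pid, code) == 2;
}

/* Usage example with the log line from this report. */
static void check_parse_crash_line(void)
{
    const char *line =
        "2011-05-09 07:49:37 PDT LOG:  server process (PID 2848) was terminated\n"
        "by exception 0xC0000005";
    int pid = 0;
    unsigned int code = 0;

    assert(parse_crash_line(line, &pid, &code) == 1);
    assert(pid == 2848);
    assert(code == 0xC0000005u);
    assert(parse_crash_line("LOG:  terminating any other active server processes",
                            &pid, &code) == 0);
}
```

A localized server log would need a different format string, since the message text is translated.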



--
Regards,
Richard Broersma Jr.

From:
Greg Smith
Date:

Richard Broersma wrote:
> Here is what the logs show:
>
> 2011-05-09 07:49:37 PDT LOG:  server process (PID 2848) was terminated
> by exception 0xC0000005
>

Too bad, that's just a generic "accessed memory you shouldn't have"
exception.  Not much help narrowing down the source.  That could be a
driver or hardware issue, but since you say it's repeatable that seems
less likely.
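
To make the "generic" part concrete: an NTSTATUS value is a 32-bit number with a severity field, a customer bit, a facility, and a code. The sketch below splits 0xC0000005 into those fields and maps a few well-known values to their names. The type and function names are made up for illustration, and the name table is deliberately tiny; ntstatus.h holds the full list.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fields of a 32-bit NTSTATUS value. */
typedef struct {
    unsigned severity;  /* bits 31-30: 0 success, 1 informational, 2 warning, 3 error */
    unsigned customer;  /* bit 29: set for codes not defined by Microsoft */
    unsigned facility;  /* bits 27-16 */
    unsigned code;      /* bits 15-0 */
} ntstatus_fields;

static ntstatus_fields split_ntstatus(uint32_t status)
{
    ntstatus_fields f;

    f.severity = (status >> 30) & 0x3u;
    f.customer = (status >> 29) & 0x1u;
    f.facility = (status >> 16) & 0xFFFu;
    f.code     = status & 0xFFFFu;
    return f;
}

/* Hypothetical helper: names for a handful of well-known codes only. */
static const char *ntstatus_name(uint32_t status)
{
    switch (status) {
    case 0xC0000005u: return "STATUS_ACCESS_VIOLATION";
    case 0xC00000FDu: return "STATUS_STACK_OVERFLOW";
    case 0xC0000094u: return "STATUS_INTEGER_DIVIDE_BY_ZERO";
    default:          return "(not in this short table)";
    }
}

/* Usage example with the code from the server log. */
static void check_ntstatus_examples(void)
{
    ntstatus_fields f = split_ntstatus(0xC0000005u);

    assert(f.severity == 3u);   /* error */
    assert(f.customer == 0u);
    assert(f.facility == 0u);
    assert(f.code == 5u);
    assert(strcmp(ntstatus_name(0xC0000005u), "STATUS_ACCESS_VIOLATION") == 0);
}
```

The decoded value says only that some memory access was invalid, which is why a stack trace is needed to find out where.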

At this point, fork in the road.  If someone else can reproduce this on
another Windows system, they may be able to run with it.  But if you can
spare some time to dig further, the instructions at
http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows
go over how to find where it's actually failing yourself.  If
you run PostgreSQL on Windows, that's good defensive practice to fit in
on a day when it's not an emergency.  (The same is true of any
platform; it just takes more time to set up on Windows.)

--
Greg Smith   2ndQuadrant US       Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



From:
"Mark Watson"
Date:

-----Original Message-----
From:  [mailto:] On behalf of Greg Smith
Sent: May 9, 2011 20:55
To: Richard Broersma
Cc:
Subject: Re: [TESTERS] 9.1Beta1 - Repeatable Crash on Windows

Richard Broersma wrote:
> Here is what the logs show:
>
> 2011-05-09 07:49:37 PDT LOG:  server process (PID 2848) was terminated
> by exception 0xC0000005
>

Too bad, that's just a generic "accessed memory you shouldn't have"
exception.  Not much help narrowing down the source.  That could be a
driver or hardware issue, but since you say it's repeatable that seems
less likely.

At this point, fork in the road.  If someone else can reproduce this on
another Windows system, they may be able to run with it.  But if you can
spare some time to dig further, the instructions at
http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows
go over how to trace into where it's actually failing at yourself.  If
you run PostgreSQL on Windows, that's good defensive practice to fit in
on a day it's not an emergency to do so.  (The same is true of any
platform, it just takes more time to setup on Windows)
------------------------
I can reproduce the same exception in another manner. I tried to get a stack trace but cannot seem to attach to the
process using either Process Explorer or WinDbg. I reproduced this using the current beta pgAdmin by right-clicking on
Login Roles, selecting New Login Role, entering role name test and password testp, selecting all role privileges, then
clicking OK. The service crashes. PID 5432 below is the pgAdmin process.
My log file (originally in French, translated here):

2011-05-10 08:00:26 EDT LOG:  server process (PID 5432) was terminated by exception 0xC0000005
2011-05-10 08:00:26 EDT HINT:  See C include file "ntstatus.h" for a description of the hexadecimal
    value.
2011-05-10 08:00:26 EDT LOG:  terminating any other active server processes
2011-05-10 08:00:26 EDT WARNING:  terminating connection because of crash of another server process
2011-05-10 08:00:26 EDT DETAIL:  The postmaster has commanded this server process to roll back the current
    transaction and exit, because another server process exited abnormally
    and possibly corrupted shared memory.
2011-05-10 08:00:26 EDT HINT:  In a moment you should be able to reconnect to the database and repeat your
    command.
2011-05-10 08:00:26 EDT WARNING:  terminating connection because of crash of another server process
2011-05-10 08:00:26 EDT DETAIL:  The postmaster has commanded this server process to roll back the current
    transaction and exit, because another server process exited abnormally
    and possibly corrupted shared memory.
2011-05-10 08:00:26 EDT HINT:  In a moment you should be able to reconnect to the database and repeat your
    command.
2011-05-10 08:00:26 EDT LOG:  all server processes terminated; reinitializing
2011-05-10 08:00:36 EDT FATAL:  pre-existing shared memory block is still in use
2011-05-10 08:00:36 EDT HINT:  Check if there are any old server processes still running, and terminate
    them.

- Mark Watson


From:
Richard Broersma
Date:

On Mon, May 9, 2011 at 5:55 PM, Greg Smith <> wrote:
> But if you can
> spare some time to dig further, the instructions at
> http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows
> go over how to trace into where it's actually failing at yourself.  If you
> run PostgreSQL on Windows, that's good defensive practice to fit in on a day
> it's not an emergency to do so.  (The same is true of any platform, it just
> takes more time to setup on Windows)

The following is the stack trace taken using the instructions for
"Getting a stack trace of a repeatable backend crash."

0:003> G
(4ac.158): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=ffffffff ebx=fffffff4 ecx=ffffffff edx=fffffffd esi=00000000 edi=00000000
eip=0063b890 esp=00d4f614 ebp=00d4f7c0 iopl=0         nv up ei ng nz na pe cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010287
postgres!datebsearch+0x30:
0063b890 0fbe0c97        movsx   ecx,byte ptr [edi+edx*4]   ds:0023:fffffff4=??
0:000> ~*k

.  0  Id: 4ac.158 Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr
00d4f620 0063d738 postgres!datebsearch+0x30
[c:\pginstaller-repo\postgres.windows\src\backend\utils\adt\datetime.c
@ 3579]
00d4f634 0063e824 postgres!DecodeSpecial+0x38
[c:\pginstaller-repo\postgres.windows\src\backend\utils\adt\datetime.c
@ 2789]
00d4f688 006a62a9 postgres!DecodeDateTime+0x654
[c:\pginstaller-repo\postgres.windows\src\backend\utils\adt\datetime.c
@ 1173]
00d4f818 006df4ce postgres!timestamp_in+0x79
[c:\pginstaller-repo\postgres.windows\src\backend\utils\adt\timestamp.c
@ 161]
00d4fa3c 006e0fc3 postgres!InputFunctionCall+0xae
[c:\pginstaller-repo\postgres.windows\src\backend\utils\fmgr\fmgr.c @
1909]
00d4fa7c 005b66f7 postgres!OidInputFunctionCall+0x33
[c:\pginstaller-repo\postgres.windows\src\backend\utils\fmgr\fmgr.c @
2041]
00d4fa98 005a4d3d postgres!stringTypeDatum+0x27
[c:\pginstaller-repo\postgres.windows\src\backend\parser\parse_type.c
@ 586]
00d4fad4 005a437e postgres!coerce_type+0x16d
[c:\pginstaller-repo\postgres.windows\src\backend\parser\parse_coerce.c
@ 253]
00d4fb08 005a915a postgres!coerce_to_target_type+0x4e
[c:\pginstaller-repo\postgres.windows\src\backend\parser\parse_coerce.c
@ 90]
00d4fb40 005aaa96 postgres!transformTypeCast+0x6a
[c:\pginstaller-repo\postgres.windows\src\backend\parser\parse_expr.c
@ 2105]
00d4fb58 005b5abc postgres!transformExpr+0x326
[c:\pginstaller-repo\postgres.windows\src\backend\parser\parse_expr.c
@ 187]
00d4fb7c 00585ac4 postgres!transformTargetList+0xdc
[c:\pginstaller-repo\postgres.windows\src\backend\parser\parse_target.c
@ 166]
00d4fb9c 00586187 postgres!transformSelectStmt+0x74
[c:\pginstaller-repo\postgres.windows\src\backend\parser\analyze.c @
903]
00d4fbb0 00586317 postgres!transformStmt+0xa7
[c:\pginstaller-repo\postgres.windows\src\backend\parser\analyze.c @
187]
00d4fbc4 0060fc4e postgres!parse_analyze+0x37
[c:\pginstaller-repo\postgres.windows\src\backend\parser\analyze.c @
97]
00d4fc70 006105e5 postgres!exec_simple_query+0x28e
[c:\pginstaller-repo\postgres.windows\src\backend\tcop\postgres.c @
945]
00d4fcf4 005ca54c postgres!PostgresMain+0x575
[c:\pginstaller-repo\postgres.windows\src\backend\tcop\postgres.c @
3926]
00d4fd14 005cd27a postgres!BackendRun+0x19c
[c:\pginstaller-repo\postgres.windows\src\backend\postmaster\postmaster.c
@ 3600]
00d4ff64 00527b1b postgres!SubPostmasterMain+0x30a
[c:\pginstaller-repo\postgres.windows\src\backend\postmaster\postmaster.c
@ 4096]
00d4ff7c 0071270d postgres!main+0x1fb
[c:\pginstaller-repo\postgres.windows\src\backend\main\main.c @ 176]
00d4ffc0 7c817077 postgres!__tmainCRTStartup+0x10f
[f:\dd\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 586]
00d4fff0 00000000 kernel32!BaseProcessStart+0x23

   1  Id: 4ac.294 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr
018bfecc 7c90d3aa ntdll!KiFastSystemCallRet
018bfed0 7c8314ae ntdll!ZwFsControlFile+0xc
018bff14 005bd337 kernel32!ConnectNamedPipe+0x52
018bffb4 7c80b729 postgres!pg_signal_thread+0x97
[c:\pginstaller-repo\postgres.windows\src\backend\port\win32\signal.c
@ 275]
018bffec 00000000 kernel32!BaseThreadStart+0x37

   2  Id: 4ac.528 Suspend: 1 Teb: 7ffdd000 Unfrozen
ChildEBP RetAddr
0473ff2c 7c90df5a ntdll!KiFastSystemCallRet
0473ff30 7c8025db ntdll!NtWaitForSingleObject+0xc
0473ff94 005be98b kernel32!WaitForSingleObjectEx+0xa8
0473ffb4 7c80b729 postgres!pg_timer_thread+0x2b
[c:\pginstaller-repo\postgres.windows\src\backend\port\win32\timer.c @
51]
0473ffec 00000000 kernel32!BaseThreadStart+0x37
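
The faulting address in the register dump can be checked by hand. The instruction reads a byte at [edi+edx*4], and the dump shows edi=00000000 and edx=fffffffd. The sketch below redoes that arithmetic with 32-bit wraparound; effective_address is an illustrative name, and the dump alone does not say which C variable each register holds.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of an x86 operand of the form [base + index*scale],
 * computed modulo 2^32 as in 32-bit mode. */
static uint32_t effective_address(uint32_t base, uint32_t index, uint32_t scale)
{
    return base + index * scale;
}

/* Usage example with the values copied from the register dump. */
static void check_faulting_address(void)
{
    uint32_t addr = effective_address(0x00000000u, 0xfffffffdu, 4u);

    assert(addr == 0xfffffff4u);   /* matches ds:0023:fffffff4 in the dump */
    assert(0u - addr == 12u);      /* 12 bytes below address zero */
}
```

So the backend tried to read 12 bytes before address zero, which is never mapped, hence the access violation.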

--
Regards,
Richard Broersma Jr.

From:
Greg Smith
Date:

On 05/10/2011 01:07 PM, Richard Broersma wrote:
> 0:003> G
> (4ac.158): Access violation - code c0000005 (first chance)
> First chance exceptions are reported before any exception handling.
> This exception may be expected and handled.
> eax=ffffffff ebx=fffffff4 ecx=ffffffff edx=fffffffd esi=00000000 edi=00000000
> eip=0063b890 esp=00d4f614 ebp=00d4f7c0 iopl=0         nv up ei ng nz na pe cy
> cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010287
> postgres!datebsearch+0x30:
> 0063b890 0fbe0c97        movsx   ecx,byte ptr [edi+edx*4]   ds:0023:fffffff4=??
> 0:000>  ~*k
>
> .  0  Id: 4ac.158 Suspend: 1 Teb: 7ffdf000 Unfrozen
> ChildEBP RetAddr
> 00d4f620 0063d738 postgres!datebsearch+0x30
> [c:\pginstaller-repo\postgres.windows\src\backend\utils\adt\datetime.c
> @ 3579]
>

That's interesting...it's going crazy here:

    3565 /* datebsearch()
    3566  * Binary search -- from Knuth (6.2.1) Algorithm B.  Special case like this
    3567  * is WAY faster than the generic bsearch().
    3568  */
    3569 static const datetkn *
    3570 datebsearch(const char *key, const datetkn *base, int nel)
    3571 {
    3572     const datetkn *last = base + nel - 1,
    3573                *position;
    3574     int         result;
    3575
    3576     while (last >= base)
    3577     {
    3578         position = base + ((last - base) >> 1);
    3579         result = key[0] - position->token[0];

So something about that is getting really confused when searching back
to negative infinity, but seemingly only on Windows.
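
For illustration only, here is a defensive variant of this kind of token lookup. It is a sketch, not PostgreSQL's code and not the fix that was eventually applied: tok_entry, TOKLEN, and tok_search are hypothetical stand-ins. It searches by index and returns early for an empty or missing table, so it never computes a pointer before the start of the array, which is one way a loop of the form "while (last >= base)" could end up reading from a bogus address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOKLEN 10   /* assumption for this sketch, not taken from the thread */

typedef struct {
    char token[TOKLEN + 1];
    int  value;
} tok_entry;        /* hypothetical stand-in for datetkn */

/* Binary search over a table sorted by token. Returns NULL when the
 * key is absent or when the table is empty or missing. */
static const tok_entry *tok_search(const char *key, const tok_entry *base, int nel)
{
    int lo = 0;
    int hi = nel - 1;

    if (base == NULL || nel <= 0)
        return NULL;    /* nothing to search, nothing to dereference */

    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        int cmp = strncmp(key, base[mid].token, TOKLEN);

        if (cmp == 0)
            return &base[mid];
        if (cmp < 0)
            hi = mid - 1;
        else
            lo = mid + 1;
    }
    return NULL;
}

/* Usage example with a small sorted table. */
static void check_tok_search(void)
{
    static const tok_entry table[] = {
        {"-infinity", -1}, {"epoch", 0}, {"infinity", 1}, {"now", 2},
    };

    assert(tok_search("infinity", table, 4) == &table[2]);
    assert(tok_search("tomorrow", table, 4) == NULL);
    assert(tok_search("infinity", NULL, 0) == NULL);   /* empty table is safe */
}
```

The index form costs nothing measurable and keeps every intermediate value inside the array bounds.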

Thanks for the great detective work help, I'll bounce this over to
pgsql-hackers where more people will see it.

--
Greg Smith   2ndQuadrant US       Baltimore, MD



From:
Richard Broersma
Date:

On Tue, May 10, 2011 at 2:41 PM, Greg Smith <> wrote:
> So something about that is getting really confused when searching back to
> negative infinity, but seemingly only on Windows.
>
> Thanks for the great detective work help, I'll bounce this over to
> pgsql-hackers where more people will see it.

No problem.  Thanks for the directions on how to help.


--
Regards,
Richard Broersma Jr.

From:
Greg Smith
Date:

Inquisition launched at
http://archives.postgresql.org/message-id/
if anyone who isn't subscribed to that heavy mailing list wants to see
what happens.

--
Greg Smith   2ndQuadrant US       Baltimore, MD



From:
Richard Broersma
Date:

I'm going on vacation, so I won't be able to respond for a couple of weeks.

On Tue, May 10, 2011 at 4:09 PM, Greg Smith <> wrote:
> Inquisition launched at
> http://archives.postgresql.org/message-id/
> if anyone who isn't subscribed to that heavy mailing list wants to see what
> happens.
>
> --
> Greg Smith   2ndQuadrant US       Baltimore, MD



--
Regards,
Richard Broersma Jr.

From:
Greg Smith
Date:

Can either/both of you seeing this crash do the following:

show timezone;
select * from pg_timezone_abbrevs;

And post the result?  Tom Lane mentioned there's a similar problem
already being chased around, and that data will help figure out how your
case fits in relation to it.

--
Greg Smith   2ndQuadrant US       Baltimore, MD



From:
"Mark Watson"
Date:

> -----Original Message-----
> From:  [mailto:pgsql-testers-] On behalf of Greg Smith
> Sent: May 11, 2011 00:28
> To: Greg Smith
> Cc: Richard Broersma;
> Subject: Re: [TESTERS] 9.1Beta1 - Repeatable Crash on Windows
>
> Can either/both of you seeing this crash do the following:
>
> show timezone;
> select * from pg_timezone_abbrevs;
>
> And post the result?  Tom Lane mentioned there's a similar problem
> already being chased around, and that data will help figure out how your
> case fits in relation to it.
>
> --
> Greg Smith   2ndQuadrant US       Baltimore, MD

Show timezone gives US/Eastern
Select * from pg_timezone_abbrevs returns zero rows
--
Mark Watson


From:
Richard Broersma
Date:

On Tue, May 10, 2011 at 9:28 PM, Greg Smith <> wrote:
> Can either/both of you seeing this crash do the following:
>
> show timezone;
> select * from pg_timezone_abbrevs;
>
> And post the result?

Here's the results:

psql (9.1beta1)
WARNING: Console code page (437) differs from Windows code page (1252)
         8-bit characters might not work correctly. See psql reference
         page "Notes for Windows users" for details.
Type "help" for help.

postgres=# show timezone;
  TimeZone
------------
 US/Pacific
(1 row)


postgres=# select * from pg_timezone_abbrevs;
 abbrev | utc_offset | is_dst
--------+------------+--------
(0 rows)

--
Regards,
Richard Broersma Jr.

From:
Greg Smith
Date:

The initial resolution on all this is that it's another instance of a
really tricky bug that should be resolved now.  Long description of the
problem is at
http://archives.postgresql.org/message-id/

Thanks for the report, and if you could check this again on 9.1 Beta 2
after it's released to confirm the fix holds, that would be helpful.

--
Greg Smith   2ndQuadrant US       Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



From:
"Mark Watson"
Date:

-----Original Message-----
> From:  [mailto:pgsql-testers-
> The initial resolution on all this is that it's another instance of a
> really tricky bug that should be resolved now.  Long description of the
> problem is at
> http://archives.postgresql.org/message-id/
>
> Thanks for the report, and if you could check this again on 9.1 Beta 2
> after it's released to confirm the fix holds, that would be helpful.
>
> --
> Greg Smith   2ndQuadrant US       Baltimore, MD
> PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us

FYI, this has been resolved with 9.1 Beta 2 on our win32 systems
-Mark Watson