Thread: Unresolved error 0xC0000409 on Windows Server
Hello,<br /><br />I am encountering an error on my Postgres installation for Windows Server 64-bit. The error was posted<a href="http://archives.postgresql.org/pgsql-bugs/2012-09/msg00014.php">here</a> a couple months ago; however, nosolution was found on the pgsql-bugs list, so I am reposting it to pgsql-hackers in the hopes that someone will be ableto help. My error message is identical to the one previously posted:<br /><br />2012-11-01 22:36:26 EDT LOG: 00000:server process (PID 7060) was terminated by exception 0xC0000409<br />2012-11-01 22:36:26 EDT DETAIL: Failed processwas running: INSERT INTO <b>[snipped SQL command]</b><br />2012-11-01 22:36:26 EDT HINT: See C include file "ntstatus.h"for a description of the hexadecimal value.<br /> 2012-11-01 22:36:26 EDT LOCATION: LogChildExit, src\backend\postmaster\postmaster.c:2884<br/>2012-11-01 22:36:26 EDT LOG: 00000: terminating any other active server processes<br/>2012-11-01 22:36:26 EDT LOCATION: HandleChildCrash, src\backend\postmaster\postmaster.c:2682<br /> 2012-11-0122:36:26 EDT WARNING: 57P00: terminating connection because of crash of another server process<br />2012-11-0122:36:26 EDT DETAIL: The postmaster has commanded this server process to roll back the current transaction andexit, because another server process exited abnormally and possibly corrupted shared memory.<br /> 2012-11-01 22:36:26EDT HINT: In a moment you should be able to reconnect to the database and repeat your command.<br />2012-11-01 22:36:26EDT LOCATION: quickdie, src\backend\tcop\postgres.c:2556<br />2012-11-01 22:36:26 EDT LOG: 00000: all server processesterminated; reinitializing<br /> 2012-11-01 22:36:26 EDT LOCATION: PostmasterStateMachine, src\backend\postmaster\postmaster.c:3135<br/>2012-11-01 22:36:36 EDT FATAL: XX000: pre-existing shared memory block is stillin use<br />2012-11-01 22:36:36 EDT HINT: Check if there are any old server processes still running, and terminatethem.<br /> 2012-11-01 22:36:36 EDT LOCATION: PGSharedMemoryCreate, src\backend\port\win32_shmem.c:194<br /><br/>The error happens regularly while performing database INSERTS. The [snipped SQL command] part above contains the INSERTcommand that was executing when the server crashed. After restarting the server the command executes fine, so it'snot a problem with the command. I installed Postgres from the standard Windows binary "postgresql-9.2.1-1-windows-x64.exe"and I have not changed any configuration settings from their default values.<br /><br/>Does anyone know what might be happening and how I might fix it?<br /><br />Thanks,<br />Matt<br />
On Fri, Nov 2, 2012 at 1:25 PM, Matthew Gerber <gerber.matthew@gmail.com> wrote: > Hello, > > I am encountering an error on my Postgres installation for Windows Server > 64-bit. The error was posted here a couple months ago; however, no solution > was found on the pgsql-bugs list, so I am reposting it to pgsql-hackers in > the hopes that someone will be able to help. My error message is identical > to the one previously posted: > > 2012-11-01 22:36:26 EDT LOG: 00000: server process (PID 7060) was > terminated by exception 0xC0000409 > 2012-11-01 22:36:26 EDT DETAIL: Failed process was running: INSERT INTO > [snipped SQL command] > 2012-11-01 22:36:26 EDT HINT: See C include file "ntstatus.h" for a > description of the hexadecimal value. > 2012-11-01 22:36:26 EDT LOCATION: LogChildExit, > src\backend\postmaster\postmaster.c:2884 > 2012-11-01 22:36:26 EDT LOG: 00000: terminating any other active server > processes > 2012-11-01 22:36:26 EDT LOCATION: HandleChildCrash, > src\backend\postmaster\postmaster.c:2682 > 2012-11-01 22:36:26 EDT WARNING: 57P00: terminating connection because of > crash of another server process > 2012-11-01 22:36:26 EDT DETAIL: The postmaster has commanded this server > process to roll back the current transaction and exit, because another > server process exited abnormally and possibly corrupted shared memory. > 2012-11-01 22:36:26 EDT HINT: In a moment you should be able to reconnect > to the database and repeat your command. > 2012-11-01 22:36:26 EDT LOCATION: quickdie, > src\backend\tcop\postgres.c:2556 > 2012-11-01 22:36:26 EDT LOG: 00000: all server processes terminated; > reinitializing > 2012-11-01 22:36:26 EDT LOCATION: PostmasterStateMachine, > src\backend\postmaster\postmaster.c:3135 > 2012-11-01 22:36:36 EDT FATAL: XX000: pre-existing shared memory block is > still in use > 2012-11-01 22:36:36 EDT HINT: Check if there are any old server processes > still running, and terminate them. > 2012-11-01 22:36:36 EDT LOCATION: PGSharedMemoryCreate, > src\backend\port\win32_shmem.c:194 > > The error happens regularly while performing database INSERTS. The [snipped > SQL command] part above contains the INSERT command that was executing when > the server crashed. After restarting the server the command executes fine, > so it's not a problem with the command. I installed Postgres from the > standard Windows binary "postgresql-9.2.1-1-windows-x64.exe" and I have not > changed any configuration settings from their default values. > > Does anyone know what might be happening and how I might fix it? hm, several times over the last couple of months (both on postgres 9.1 and 9.2), i've seen a similar crash, but on linux. It hits the log like this: Execution halted (~ 200x) Error: segfault from C stack overflow Execution halted (~ 30x) LOG: server process (PID 19882) was terminated by signal 11: Segmentation fault LOG: terminating any other active server processes WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. note the lack of LOG in 'Execution halted', etc. This has happened several times, on different servers using different workloads (but always under load). As of yet, I've been unable to get a core but I hope to get one next time it happens. I wonder if it's a similar cause? One thing I've been tempted to try is raising max_stack_depth to see if that helps the problem. merlin
On Fri, Nov 02, 2012 at 02:05:47PM -0500, Merlin Moncure wrote: > On Fri, Nov 2, 2012 at 1:25 PM, Matthew Gerber <gerber.matthew@gmail.com> wrote: > > I am encountering an error on my Postgres installation for Windows Server > > 64-bit. The error was posted here a couple months ago; however, no solution > > was found on the pgsql-bugs list, so I am reposting it to pgsql-hackers in > > the hopes that someone will be able to help. My error message is identical > > to the one previously posted: > > > > 2012-11-01 22:36:26 EDT LOG: 00000: server process (PID 7060) was > > terminated by exception 0xC0000409 > > 2012-11-01 22:36:26 EDT DETAIL: Failed process was running: INSERT INTO > > [snipped SQL command] Could you post an anonymized query, post an anonymized query plan, and/or describe the general nature of the query? Does it call functions? About how many rows does it insert? What server settings have you customized? https://wiki.postgresql.org/wiki/Server_Configuration If you could get a stack trace or minidump, that would be most helpful: https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows Magnus's questions for the reporter of bug #7517 are relevant, too. Does the system have any antivirus software installed? > > 2012-11-01 22:36:26 EDT LOG: 00000: all server processes terminated; > > reinitializing > > 2012-11-01 22:36:26 EDT LOCATION: PostmasterStateMachine, > > src\backend\postmaster\postmaster.c:3135 > > 2012-11-01 22:36:36 EDT FATAL: XX000: pre-existing shared memory block is > > still in use > > 2012-11-01 22:36:36 EDT HINT: Check if there are any old server processes > > still running, and terminate them. > > 2012-11-01 22:36:36 EDT LOCATION: PGSharedMemoryCreate, > > src\backend\port\win32_shmem.c:194 This part smells like a bug in its own right. > hm, several times over the last couple of months (both on postgres 9.1 > and 9.2), i've seen a similar crash, but on linux. It hits the log > like this: > > Execution halted (~ 200x) > Error: segfault from C stack overflow > Execution halted (~ 30x) > LOG: server process (PID 19882) was terminated by signal 11: Segmentation fault > LOG: terminating any other active server processes > note the lack of LOG in 'Execution halted', etc. This has happened > several times, on different servers using different workloads (but > always under load). As of yet, I've been unable to get a core but I > hope to get one next time it happens. I wonder if it's a similar > cause? Google suggests those unadorned messages originate in R. Do the affected systems use PL/R? If so ... > One thing I've been tempted to try is raising max_stack_depth to see > if that helps the problem. ... that probably won't help. Depending on the specifics of the situation, *lowering* max_stack_depth might tend to give you an ERROR instead of a crash. Or it might just give R a bit more stack space to devour before reaching the same crash it would otherwise reach. Thanks, nm
On Fri, Nov 2, 2012 at 9:00 PM, Noah Misch <noah@leadboat.com> wrote:
Here is the command that was executing when the 0xC0000409 exception was raised:
INSERT INTO places (bounding_box,country,full_name,id,name,type,url)
VALUES
(st_transform_null(ST_GeometryFromText('POLYGON((-97.034085 32.771786,-97.034085 32.953966,-96.888789 32.953966,-96.888789 32.771786,-97.034085 32.771786))',4326),26918),'United States','Irving, TX','dce44ec49eb788f5','Irving','city','http://api.twitter.com/1/geo/id/dce44ec49eb788f5.json'),
(st_transform_null(ST_GeometryFromText('POLYGON((107.610398 -6.9006302,107.610398 -6.864448,107.637222 -6.864448,107.637222 -6.9006302,107.610398 -6.9006302))',4326),26918),'Indonesia','Coblong, Kota Bandung','2c0294c5eab821c9','Coblong','city','http://api.twitter.com/1/geo/id/2c0294c5eab821c9.json')
The st_transform_null function is simply a wrapper around the PostGIS st_transform function that deals with NULL values. The other fields are all VARCHARs. This insert is only adding two values. In general, the insert commands I'm running insert anywhere up to 100 rows each, so they're not huge.
I haven't customized any settings.
So I attached the VS debugger, but the server died without raising an exception in VS. Not sure what's going on here.
Testing disabling the AV software now. Will post back.
Nope.
Really appreciate any help you can provide.
Matt
On Fri, Nov 02, 2012 at 02:05:47PM -0500, Merlin Moncure wrote:
> On Fri, Nov 2, 2012 at 1:25 PM, Matthew Gerber <gerber.matthew@gmail.com> wrote:> > I am encountering an error on my Postgres installation for Windows ServerCould you post an anonymized query, post an anonymized query plan, and/or
> > 64-bit. The error was posted here a couple months ago; however, no solution
> > was found on the pgsql-bugs list, so I am reposting it to pgsql-hackers in
> > the hopes that someone will be able to help. My error message is identical
> > to the one previously posted:
> >
> > 2012-11-01 22:36:26 EDT LOG: 00000: server process (PID 7060) was
> > terminated by exception 0xC0000409
> > 2012-11-01 22:36:26 EDT DETAIL: Failed process was running: INSERT INTO
> > [snipped SQL command]
describe the general nature of the query? Does it call functions? About how
many rows does it insert?
Here is the command that was executing when the 0xC0000409 exception was raised:
INSERT INTO places (bounding_box,country,full_name,id,name,type,url)
VALUES
(st_transform_null(ST_GeometryFromText('POLYGON((-97.034085 32.771786,-97.034085 32.953966,-96.888789 32.953966,-96.888789 32.771786,-97.034085 32.771786))',4326),26918),'United States','Irving, TX','dce44ec49eb788f5','Irving','city','http://api.twitter.com/1/geo/id/dce44ec49eb788f5.json'),
(st_transform_null(ST_GeometryFromText('POLYGON((107.610398 -6.9006302,107.610398 -6.864448,107.637222 -6.864448,107.637222 -6.9006302,107.610398 -6.9006302))',4326),26918),'Indonesia','Coblong, Kota Bandung','2c0294c5eab821c9','Coblong','city','http://api.twitter.com/1/geo/id/2c0294c5eab821c9.json')
The st_transform_null function is simply a wrapper around the PostGIS st_transform function that deals with NULL values. The other fields are all VARCHARs. This insert is only adding two values. In general, the insert commands I'm running insert anywhere up to 100 rows each, so they're not huge.
What server settings have you customized?
https://wiki.postgresql.org/wiki/Server_Configuration
I haven't customized any settings.
If you could get a stack trace or minidump, that would be most helpful:
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows
So I attached the VS debugger, but the server died without raising an exception in VS. Not sure what's going on here.
Magnus's questions for the reporter of bug #7517 are relevant, too. Does the
system have any antivirus software installed?
Testing disabling the AV software now. Will post back.
Google suggests those unadorned messages originate in R. Do the affected
systems use PL/R? If so ...
Nope.
Really appreciate any help you can provide.
Matt
On 11/04/2012 08:47 AM, Matthew Gerber wrote:
Here is the command that was executing when the 0xC0000409 exception was raised:
INSERT INTO places (bounding_box,country,full_name,id,name,type,url)
VALUES
(st_transform_null(ST_GeometryFromText('POLYGON((-97.034085 32.771786,-97.034085 32.953966,-96.888789 32.953966,-96.888789 32.771786,-97.034085 32.771786))',4326),26918),'United States','Irving, TX','dce44ec49eb788f5','Irving','city','http://api.twitter.com/1/geo/id/dce44ec49eb788f5.json'),
OK, so you're using PostGIS. What other extensions are loaded? PL/R? Any other PLs?
Can you show the definition of the table `places`, incuding any associated triggers, etc? Use `\d+` in psql for the table def'n.
Please also post the query plan. http://explain.depesz.com/ is useful for this.
Try creating a directory called "crashdumps" in the data directory, at the same level as "pg_xlog" and "pg_clog" etc. Give the "postgresql" user the "full control" permission on it. Then run the test again.
So I attached the VS debugger, but the server died without raising an exception in VS. Not sure what's going on here.
Do any minidump files appear in the directory? If so, you can examine them with windbg or Visual Studio to see where the crash happened.
--
Craig Ringer
On Sun, Nov 4, 2012 at 3:39 AM, Craig Ringer <ringerc@ringerc.id.au> wrote:
PostGIS is the only extension that I added to the default configuration. I didn't change anything else.
Here the definition of the places table:
twitter=# \d+ places
Table "public.places"
Column | Type | Modifiers | Storage | Stats target |Description
--------------+-------------------------+-----------+----------+--------------+-------------
bounding_box | geometry(Polygon,26918) | | main | |
country | character varying | | extended | |
full_name | character varying | | extended | |
id | character varying | not null | extended | |
name | character varying | | extended | |
type | character varying | | extended | |
url | character varying | | extended | |
Indexes:
"places_pkey" PRIMARY KEY, btree (id)
"places_bounding_box_idx" gist (bounding_box)
"places_type_idx" btree (type)
Referenced by:
TABLE "tweets" CONSTRAINT "tweets_place_id_fkey" FOREIGN KEY (place_id) REFERENCES places(id) ON DELETE CASCADE
Has OIDs: no
Here is the query plan:
QUERY PLAN
Insert on public.places (cost=0.00..0.01 rows=1 width=0) (actual time=1.000..1.000 rows=0 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
Output: '01030000202669000001000000050000005E8705A4E32C38C1EE553AE6E95E
4D41086A91990B1B38C11620AF9784874D41FCA5741676E437C19436654287814D41C43E11332BF6
37C17C863746F0584D415E8705A4E32C38C1EE553AE6E95E4D41'::geometry(Polygon,26918),
'United States'::character varying, 'Irving, TX'::character varying, 'dce44ec49e
b788f5'::character varying, 'Irving'::character varying, 'city'::character varyi
ng, 'http://api.twitter.com/1/geo/id/dce44ec49eb788f5.json'::character varying
Total runtime: 1.157 ms
(4 rows)
Running it now.
Will try it.
Thanks for your help so far, guys. Hopefully we get somewhere on this...
Matt
OK, so you're using PostGIS. What other extensions are loaded? PL/R? Any other PLs?On 11/04/2012 08:47 AM, Matthew Gerber wrote:Here is the command that was executing when the 0xC0000409 exception was raised:
INSERT INTO places (bounding_box,country,full_name,id,name,type,url)
VALUES
(st_transform_null(ST_GeometryFromText('POLYGON((-97.034085 32.771786,-97.034085 32.953966,-96.888789 32.953966,-96.888789 32.771786,-97.034085 32.771786))',4326),26918),'United States','Irving, TX','dce44ec49eb788f5','Irving','city','http://api.twitter.com/1/geo/id/dce44ec49eb788f5.json'),
PostGIS is the only extension that I added to the default configuration. I didn't change anything else.
Can you show the definition of the table `places`, incuding any associated triggers, etc? Use `\d+` in psql for the table def'n.
Here the definition of the places table:
twitter=# \d+ places
Table "public.places"
Column | Type | Modifiers | Storage | Stats target |Description
--------------+-------------------------+-----------+----------+--------------+-------------
bounding_box | geometry(Polygon,26918) | | main | |
country | character varying | | extended | |
full_name | character varying | | extended | |
id | character varying | not null | extended | |
name | character varying | | extended | |
type | character varying | | extended | |
url | character varying | | extended | |
Indexes:
"places_pkey" PRIMARY KEY, btree (id)
"places_bounding_box_idx" gist (bounding_box)
"places_type_idx" btree (type)
Referenced by:
TABLE "tweets" CONSTRAINT "tweets_place_id_fkey" FOREIGN KEY (place_id) REFERENCES places(id) ON DELETE CASCADE
Has OIDs: no
Please also post the query plan. http://explain.depesz.com/ is useful for this.
Here is the query plan:
QUERY PLAN
Insert on public.places (cost=0.00..0.01 rows=1 width=0) (actual time=1.000..1.000 rows=0 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
Output: '01030000202669000001000000050000005E8705A4E32C38C1EE553AE6E95E
4D41086A91990B1B38C11620AF9784874D41FCA5741676E437C19436654287814D41C43E11332BF6
37C17C863746F0584D415E8705A4E32C38C1EE553AE6E95E4D41'::geometry(Polygon,26918),
'United States'::character varying, 'Irving, TX'::character varying, 'dce44ec49e
b788f5'::character varying, 'Irving'::character varying, 'city'::character varyi
ng, 'http://api.twitter.com/1/geo/id/dce44ec49eb788f5.json'::character varying
Total runtime: 1.157 ms
(4 rows)
Try creating a directory called "crashdumps" in the data directory, at the same level as "pg_xlog" and "pg_clog" etc. Give the "postgresql" user the "full control" permission on it. Then run the test again.
So I attached the VS debugger, but the server died without raising an exception in VS. Not sure what's going on here.
Running it now.
Do any minidump files appear in the directory? If so, you can examine them with windbg or Visual Studio to see where the crash happened.
Will try it.
Thanks for your help so far, guys. Hopefully we get somewhere on this...
Matt
Matthew Gerber <gerber.matthew@gmail.com> writes: >> Here is the command that was executing when the 0xC0000409 exception was >> raised: >> INSERT INTO places (bounding_box,country,full_name,id,name,type,url) >> VALUES >> (st_transform_null(ST_GeometryFromText('POLYGON((-97.034085 >> 32.771786,-97.034085 32.953966,-96.888789 32.953966,-96.888789 >> 32.771786,-97.034085 32.771786))',4326),26918),'United States','Irving, >> TX','dce44ec49eb788f5','Irving','city',' >> http://api.twitter.com/1/geo/id/dce44ec49eb788f5.json'), Assuming that 0xC0000409 does actually represent a stack-overrun condition, it doesn't seem like there are very many ways that could happen on such a simple command. The best theory that is coming to mind is that you hit some corner case in the GiST code that is causing the index-entry-insertion attempt to recurse infinitely, or at least enough times to hit the stack limit. (I exclude the theory of infinite recursion in the btree indexes only on the grounds that those are so much better tested than GiST that the idea seems improbable.) It's not clear yet whether the bug is in the generic GiST code or in the PostGIS-provided index operator class. If that is the explanation, then hitting the crash would likely depend not only on the specific bounding_box value being inserted, but also on the previous state of the places_bounding_box_idx index, which could make it darn hard to reproduce. If you can't easily create a reproducer script, I think we'll have to ask you for a stack trace from the crash. See https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend regards, tom lane
On Sun, Nov 4, 2012 at 3:39 AM, Craig Ringer <ringerc@ringerc.id.au> wrote:
I did this but nothing appears in crashdumps after the server crashes. The latest test I did included the addition of this directory and the disabling of my antivirus software. Nothing seems to have changed. Following Tom's suggestion, I'll try to get a stack trace again (last time didn't produce anything).
The only other thing I've noticed is that the crash always occurs when inserting into the "places" table (definition in previous email), even though there are two other tables that are also receiving inserts. This is odd to me. Any thoughts?
Matt
On 11/04/2012 08:47 AM, Matthew Gerber wrote:Try creating a directory called "crashdumps" in the data directory, at the same level as "pg_xlog" and "pg_clog" etc. Give the "postgresql" user the "full control" permission on it. Then run the test again.
So I attached the VS debugger, but the server died without raising an exception in VS. Not sure what's going on here.
Do any minidump files appear in the directory? If so, you can examine them with windbg or Visual Studio to see where the crash happened.
I did this but nothing appears in crashdumps after the server crashes. The latest test I did included the addition of this directory and the disabling of my antivirus software. Nothing seems to have changed. Following Tom's suggestion, I'll try to get a stack trace again (last time didn't produce anything).
The only other thing I've noticed is that the crash always occurs when inserting into the "places" table (definition in previous email), even though there are two other tables that are also receiving inserts. This is odd to me. Any thoughts?
Matt
On Fri, Nov 2, 2012 at 8:00 PM, Noah Misch <noah@leadboat.com> wrote: >> hm, several times over the last couple of months (both on postgres 9.1 >> and 9.2), i've seen a similar crash, but on linux. It hits the log >> like this: >> >> Execution halted (~ 200x) >> Error: segfault from C stack overflow >> Execution halted (~ 30x) >> LOG: server process (PID 19882) was terminated by signal 11: Segmentation fault >> LOG: terminating any other active server processes > >> note the lack of LOG in 'Execution halted', etc. This has happened >> several times, on different servers using different workloads (but >> always under load). As of yet, I've been unable to get a core but I >> hope to get one next time it happens. I wonder if it's a similar >> cause? > > Google suggests those unadorned messages originate in R. Do the affected > systems use PL/R? If so ... yes -- they do. I was pretty certain that no R code was running at the time of the crash but not 100% sure. That's a big clue -- thanks.Investigating... merlin
On Mon, Nov 05, 2012 at 09:10:30AM -0500, Matthew Gerber wrote: > On Sun, Nov 4, 2012 at 3:39 AM, Craig Ringer <ringerc@ringerc.id.au> wrote: > > On 11/04/2012 08:47 AM, Matthew Gerber wrote: > > So I attached the VS debugger, but the server died without raising an > > exception in VS. Not sure what's going on here. Not sure either. I attached WinDbg to a client backend and directed that backend to call a function written to trigger the same exception. It caught the exception and reported a credible stack trace. I get the same outcome using Visual Studio Express 2012 for Windows Desktop. > >> Try creating a directory called "crashdumps" in the data directory, at > > the same level as "pg_xlog" and "pg_clog" etc. Give the "postgresql" user > > the "full control" permission on it. Then run the test again. > > > > Do any minidump files appear in the directory? If so, you can examine them > > with windbg or Visual Studio to see where the crash happened. > > I did this but nothing appears in crashdumps after the server crashes. The > latest test I did included the addition of this directory and the disabling > of my antivirus software. Nothing seems to have changed. Following Tom's > suggestion, I'll try to get a stack trace again (last time didn't produce > anything). I now see that this exception cannot yield a minidump; the CRT restores the default handler before raising it. Since this exception is intended to avert a security exposure, perhaps Microsoft reasoned that allowing application code to regain control would dilute that benefit. That choice is certainly inconvenient for us, though. > The only other thing I've noticed is that the crash always occurs when > inserting into the "places" table (definition in previous email), even > though there are two other tables that are also receiving inserts. This is > odd to me. Any thoughts? That's not intrinsically surprising unless, say, the tables have the same structure and receive the same data. Thanks, nm
On Sun, Nov 04, 2012 at 02:30:38PM -0500, Tom Lane wrote: > Matthew Gerber <gerber.matthew@gmail.com> writes: > >> Here is the command that was executing when the 0xC0000409 exception was > >> raised: > >> INSERT INTO places (bounding_box,country,full_name,id,name,type,url) > >> VALUES > >> (st_transform_null(ST_GeometryFromText('POLYGON((-97.034085 > >> 32.771786,-97.034085 32.953966,-96.888789 32.953966,-96.888789 > >> 32.771786,-97.034085 32.771786))',4326),26918),'United States','Irving, > >> TX','dce44ec49eb788f5','Irving','city',' > >> http://api.twitter.com/1/geo/id/dce44ec49eb788f5.json'), > > Assuming that 0xC0000409 does actually represent a stack-overrun > condition, It signifies scribbling past the end of the frame's local variables: http://msdn.microsoft.com/en-us/library/8dbf701c.aspx
On Sun, Nov 11, 2012 at 12:23 AM, Noah Misch <noah@leadboat.com> wrote:
A slight update on this: previously, my insert code involved a long "SELECT ... UNION ALL ... SELECT ... UNION ALL ..." command. If this command was too long, I would get a stack depth exception thrown back to my client application. I reduced the length of the command to eliminate the client-side exceptions, but on some occasions I would still get the 0xC0000409 crash on the server side. I have eliminated the long "SELECT ... UNION ALL ... " command, and I now get no errors on either side. So it seems like long commands like this were actually causing the server-side crashes. The proper behavior would seem to be throwing the exception back to the client application instead of crashing the server.
Hope this helps...
Matt
On Sun, Nov 04, 2012 at 02:30:38PM -0500, Tom Lane wrote:
> Matthew Gerber <gerber.matthew@gmail.com> writes:
> >> Here is the command that was executing when the 0xC0000409 exception was
> >> raised:
> >> INSERT INTO places (bounding_box,country,full_name,id,name,type,url)
> >> VALUES
> >> (st_transform_null(ST_GeometryFromText('POLYGON((-97.034085
> >> 32.771786,-97.034085 32.953966,-96.888789 32.953966,-96.888789
> >> 32.771786,-97.034085 32.771786))',4326),26918),'United States','Irving,
> >> TX','dce44ec49eb788f5','Irving','city','
> >> http://api.twitter.com/1/geo/id/dce44ec49eb788f5.json'),
>
> Assuming that 0xC0000409 does actually represent a stack-overrun
> condition,
It signifies scribbling past the end of the frame's local variables:
http://msdn.microsoft.com/en-us/library/8dbf701c.aspx
A slight update on this: previously, my insert code involved a long "SELECT ... UNION ALL ... SELECT ... UNION ALL ..." command. If this command was too long, I would get a stack depth exception thrown back to my client application. I reduced the length of the command to eliminate the client-side exceptions, but on some occasions I would still get the 0xC0000409 crash on the server side. I have eliminated the long "SELECT ... UNION ALL ... " command, and I now get no errors on either side. So it seems like long commands like this were actually causing the server-side crashes. The proper behavior would seem to be throwing the exception back to the client application instead of crashing the server.
Hope this helps...
Matt
Matthew Gerber <gerber.matthew@gmail.com> writes: > On Sun, Nov 11, 2012 at 12:23 AM, Noah Misch <noah@leadboat.com> wrote: >> It signifies scribbling past the end of the frame's local variables: >> http://msdn.microsoft.com/en-us/library/8dbf701c.aspx > A slight update on this: previously, my insert code involved a long > "SELECT ... UNION ALL ... SELECT ... UNION ALL ..." command. How long is "long"? regards, tom lane
On Sun, Nov 11, 2012 at 11:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I was seeing queries with around 5000-7000 "UNION ALL" statements.
Matt
Matthew Gerber <gerber.matthew@gmail.com> writes:
> On Sun, Nov 11, 2012 at 12:23 AM, Noah Misch <noah@leadboat.com> wrote:>> It signifies scribbling past the end of the frame's local variables:How long is "long"?
>> http://msdn.microsoft.com/en-us/library/8dbf701c.aspx
> A slight update on this: previously, my insert code involved a long
> "SELECT ... UNION ALL ... SELECT ... UNION ALL ..." command.
I was seeing queries with around 5000-7000 "UNION ALL" statements.
Matt
Matthew Gerber <gerber.matthew@gmail.com> writes: > On Sun, Nov 11, 2012 at 11:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> How long is "long"? > I was seeing queries with around 5000-7000 "UNION ALL" statements. Hm. I experimented with test queries created like so: perl -e 'print "SELECT 1 a, 2 b, 3 c\n"; print "UNION ALL SELECT 1 a, 2 b, 3 c\n" foreach (1..8200);' | psql On the machine I tried this on, it works up to about 8200 and then fails in the way I'd expect: ERROR: stack depth limit exceeded HINT: Increase the configuration parameter "max_stack_depth" (currently 2048kB), after ensuring the platform's stack depthlimit is adequate. But then when I cranked it up to 80000, kaboom: connection to server was lost Inspection of the core dump shows transformSetOperationTree is the problem --- it's recursing but lacks a check_stack_depth test. So that's easy to fix, but I wonder why the critical depth limit seems to be so much less on your machine. I get the expected error up to about 65000 UNION ALLs --- why is yours crashing at a tenth of that? regards, tom lane
On Sun, Nov 11, 2012 at 12:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Tom,
Interesting. I really have no idea why mine seemed to fail so much sooner. I recalled my 5k-7k figure from memory, so I might be off on that, but probably not by an order of magnitude. In any case, it sounds like you know how to fix the problem. Should I file this as a bug report or will you take care of it from here?
Best,
Matt
Matthew Gerber <gerber.matthew@gmail.com> writes:
> On Sun, Nov 11, 2012 at 11:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:>> How long is "long"?Hm. I experimented with test queries created like so:
> I was seeing queries with around 5000-7000 "UNION ALL" statements.
perl -e 'print "SELECT 1 a, 2 b, 3 c\n"; print "UNION ALL SELECT 1 a, 2 b, 3 c\n" foreach (1..8200);' | psql
On the machine I tried this on, it works up to about 8200 and then fails
in the way I'd expect:
ERROR: stack depth limit exceeded
HINT: Increase the configuration parameter "max_stack_depth" (currently 2048kB), after ensuring the platform's stack depth limit is adequate.
But then when I cranked it up to 80000, kaboom:
connection to server was lost
Inspection of the core dump shows transformSetOperationTree is the
problem --- it's recursing but lacks a check_stack_depth test.
So that's easy to fix, but I wonder why the critical depth limit seems
to be so much less on your machine. I get the expected error up to
about 65000 UNION ALLs --- why is yours crashing at a tenth of that?
Tom,
Interesting. I really have no idea why mine seemed to fail so much sooner. I recalled my 5k-7k figure from memory, so I might be off on that, but probably not by an order of magnitude. In any case, it sounds like you know how to fix the problem. Should I file this as a bug report or will you take care of it from here?
Best,
Matt
Matthew Gerber <gerber.matthew@gmail.com> writes: > Interesting. I really have no idea why mine seemed to fail so much sooner. > I recalled my 5k-7k figure from memory, so I might be off on that, but > probably not by an order of magnitude. In any case, it sounds like you know > how to fix the problem. Should I file this as a bug report or will you take > care of it from here? I will fix the crash I'm seeing; I'm just not 100% convinced it's the same crash you're seeing. It'd be useful if some other folk would poke at similar examples on Windows to see if there's something unexpected happening with stack depth checks there. regards, tom lane
On Sun, Nov 11, 2012 at 12:22:24PM -0500, Tom Lane wrote: > perl -e 'print "SELECT 1 a, 2 b, 3 c\n"; print "UNION ALL SELECT 1 a, 2 b, 3 c\n" foreach (1..8200);' | psql > > On the machine I tried this on, it works up to about 8200 and then fails > in the way I'd expect: > > ERROR: stack depth limit exceeded > HINT: Increase the configuration parameter "max_stack_depth" (currently 2048kB), after ensuring the platform's stack depthlimit is adequate. > > But then when I cranked it up to 80000, kaboom: > > connection to server was lost I tried this test case on Windows Server 2008 (x64). It hit max_stack_depth at 9000 UNIONs and crashed at 10000. When I run it under a debugger, the debugger reports exception 0xC00000FD (STATUS_STACK_OVERFLOW). Run normally, the server log reports exception 0xC0000005 (STATUS_ACCESS_VIOLATION). > Inspection of the core dump shows transformSetOperationTree is the > problem --- it's recursing but lacks a check_stack_depth test. > So that's easy to fix, but I wonder why the critical depth limit seems > to be so much less on your machine. I get the expected error up to > about 65000 UNION ALLs --- why is yours crashing at a tenth of that? So, I can reproduce the lower threshold, but the exception type does not agree with the one Matthew observed.
On Sun, Nov 11, 2012 at 10:10:31AM -0500, Matthew Gerber wrote: > > > Matthew Gerber <gerber.matthew@gmail.com> writes: > > > >> Here is the command that was executing when the 0xC0000409 exception > > was > > > >> raised: > > > >> INSERT INTO places (bounding_box,country,full_name,id,name,type,url) > > > >> VALUES > > > >> (st_transform_null(ST_GeometryFromText('POLYGON((-97.034085 > > > >> 32.771786,-97.034085 32.953966,-96.888789 32.953966,-96.888789 > > > >> 32.771786,-97.034085 32.771786))',4326),26918),'United > > States','Irving, > > > >> TX','dce44ec49eb788f5','Irving','city',' > > > >> http://api.twitter.com/1/geo/id/dce44ec49eb788f5.json'), > A slight update on this: previously, my insert code involved a long > "SELECT ... UNION ALL ... SELECT ... UNION ALL ..." command. If this > command was too long, I would get a stack depth exception thrown back to my > client application. I reduced the length of the command to eliminate the > client-side exceptions, but on some occasions I would still get the > 0xC0000409 crash on the server side. I have eliminated the long "SELECT ... > UNION ALL ... " command, and I now get no errors on either side. So it > seems like long commands like this were actually causing the server-side > crashes. The proper behavior would seem to be throwing the exception back > to the client application instead of crashing the server. Above, you quoted an INSERT ... VALUES of two rows. Have you observed an exception-0xC0000409 crash with an INSERT ... VALUES query, or only with an "INSERT ... SELECT ... thousands of UNION" query?
On Sun, Nov 11, 2012 at 2:43 PM, Noah Misch <noah@leadboat.com> wrote:
Every time the server crashed with 0xC0000409, the log reported that it was running the simple INSERT command (two rows) that I started this thread with. However, this didn't make any sense to me given the simplicity of the INSERT command and the fact that the error indicated a stack overflow. So I removed the long "SELECT ... UNION ALL ..." command since it seemed more relevant to the error, and the process has been running continuously for a few days now.
To answer your question directly: I was seeing the server crash when using the simple INSERT and long "SELECT ... UNION ..." (these commands are issued independently at different points in the program). Now my program is only using the simple INSERT, and the crashes are gone.
Hope this helps...
Matt
On Sun, Nov 11, 2012 at 10:10:31AM -0500, Matthew Gerber wrote:
> > > Matthew Gerber <gerber.matthew@gmail.com> writes:
> > > >> Here is the command that was executing when the 0xC0000409 exception
> > was
> > > >> raised:
> > > >> INSERT INTO places (bounding_box,country,full_name,id,name,type,url)
> > > >> VALUES
> > > >> (st_transform_null(ST_GeometryFromText('POLYGON((-97.034085
> > > >> 32.771786,-97.034085 32.953966,-96.888789 32.953966,-96.888789
> > > >> 32.771786,-97.034085 32.771786))',4326),26918),'United
> > States','Irving,
> > > >> TX','dce44ec49eb788f5','Irving','city','
> > > >> http://api.twitter.com/1/geo/id/dce44ec49eb788f5.json'),> A slight update on this: previously, my insert code involved a longAbove, you quoted an INSERT ... VALUES of two rows. Have you observed an
> "SELECT ... UNION ALL ... SELECT ... UNION ALL ..." command. If this
> command was too long, I would get a stack depth exception thrown back to my
> client application. I reduced the length of the command to eliminate the
> client-side exceptions, but on some occasions I would still get the
> 0xC0000409 crash on the server side. I have eliminated the long "SELECT ...
> UNION ALL ... " command, and I now get no errors on either side. So it
> seems like long commands like this were actually causing the server-side
> crashes. The proper behavior would seem to be throwing the exception back
> to the client application instead of crashing the server.
exception-0xC0000409 crash with an INSERT ... VALUES query, or only with an
"INSERT ... SELECT ... thousands of UNION" query?
Every time the server crashed with 0xC0000409, the log reported that it was running the simple INSERT command (two rows) that I started this thread with. However, this didn't make any sense to me given the simplicity of the INSERT command and the fact that the error indicated a stack overflow. So I removed the long "SELECT ... UNION ALL ..." command since it seemed more relevant to the error, and the process has been running continuously for a few days now.
To answer your question directly: I was seeing the server crash when using the simple INSERT and long "SELECT ... UNION ..." (these commands are issued independently at different points in the program). Now my program is only using the simple INSERT, and the crashes are gone.
Hope this helps...
Matt
Noah Misch <noah@leadboat.com> writes: > So, I can reproduce the lower threshold, but the exception type does not agree > with the one Matthew observed. I finally got around to looking at the link you provided about error 0xC0000409, and realized that I'd been completely confusing it with stack overflow --- but actually, it's a report that something scribbled past the end of a finite-size local-variable array. So I now think that Matthew's stumbled across two completely independent bugs, and we've fixed only one of them. The 0xC0000409 error is something else, and possibly a lot worse since it could conceivably be a security issue. It still seems likely that the actual location of the bug is either in PostGIS or in the GIST index code, but without the ability to reproduce the failure it's awfully hard to find it. Matthew, could you try a bit harder to find a self-contained test case that produces that error? regards, tom lane
On Sun, Nov 11, 2012 at 8:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Sure, it might take me a while to find time but I'll keep it on my list.
Matt
Noah Misch <noah@leadboat.com> writes:I finally got around to looking at the link you provided about error
> So, I can reproduce the lower threshold, but the exception type does not agree
> with the one Matthew observed.
0xC0000409, and realized that I'd been completely confusing it with
stack overflow --- but actually, it's a report that something scribbled
past the end of a finite-size local-variable array. So I now think that
Matthew's stumbled across two completely independent bugs, and we've
fixed only one of them. The 0xC0000409 error is something else, and
possibly a lot worse since it could conceivably be a security issue.
It still seems likely that the actual location of the bug is either
in PostGIS or in the GIST index code, but without the ability to
reproduce the failure it's awfully hard to find it. Matthew, could
you try a bit harder to find a self-contained test case that produces
that error?
regards, tom lane
Sure, it might take me a while to find time but I'll keep it on my list.
Matt
All,
I have successfully isolated this error and created a simple SQL script to reproduce it. Just to recap - this script will cause a server crash with exception 0xC0000409 as described in previous emails. The crux of the problem seems to be my creation / use of the function st_transform_null. My intent with this function is to wrap the st_transform function provided by PostGIS, but account for the situation where the argument to be transformed is NULL. In this situation, st_transform throws an internal_error, which my function catches and returns NULL for. The error / crash is not caused by a NULL argument; rather, it is caused by the final value in the attached script's INSERT statement, which contains a lat/lon pair that is beyond PostGIS's range. I'm not questioning whether this value is actually outside the legal range, but I do not think such an input should cause the server to crash completely.
Here are the steps to reproduce the crash:
1) Create a new instance of a 9.2 server (Windows 64-bit), and a new database (call it test) with the PostGIS extension.
2) Run the script:
psql -U postgres -d test -f C:\server_crash.sql
You should see the following:
psql:C:/server_crash.sql:31: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
psql:C:/server_crash.sql:31: connection to server was lost
3) Check your log for the error.
I hope this helps. It took me quite a while to track down the problem so I hope someone can figure out what is going on under the hood. It seems to be a pretty significant problem.
Cheers,
Matt
I have successfully isolated this error and created a simple SQL script to reproduce it. Just to recap - this script will cause a server crash with exception 0xC0000409 as described in previous emails. The crux of the problem seems to be my creation / use of the function st_transform_null. My intent with this function is to wrap the st_transform function provided by PostGIS, but account for the situation where the argument to be transformed is NULL. In this situation, st_transform throws an internal_error, which my function catches and returns NULL for. The error / crash is not caused by a NULL argument; rather, it is caused by the final value in the attached script's INSERT statement, which contains a lat/lon pair that is beyond PostGIS's range. I'm not questioning whether this value is actually outside the legal range, but I do not think such an input should cause the server to crash completely.
Here are the steps to reproduce the crash:
1) Create a new instance of a 9.2 server (Windows 64-bit), and a new database (call it test) with the PostGIS extension.
2) Run the script:
psql -U postgres -d test -f C:\server_crash.sql
You should see the following:
psql:C:/server_crash.sql:31: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
psql:C:/server_crash.sql:31: connection to server was lost
3) Check your log for the error.
I hope this helps. It took me quite a while to track down the problem so I hope someone can figure out what is going on under the hood. It seems to be a pretty significant problem.
Cheers,
Matt
On Sun, Nov 11, 2012 at 9:45 PM, Matthew Gerber <gerber.matthew@gmail.com> wrote:
On Sun, Nov 11, 2012 at 8:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:Noah Misch <noah@leadboat.com> writes:I finally got around to looking at the link you provided about error
> So, I can reproduce the lower threshold, but the exception type does not agree
> with the one Matthew observed.
0xC0000409, and realized that I'd been completely confusing it with
stack overflow --- but actually, it's a report that something scribbled
past the end of a finite-size local-variable array. So I now think that
Matthew's stumbled across two completely independent bugs, and we've
fixed only one of them. The 0xC0000409 error is something else, and
possibly a lot worse since it could conceivably be a security issue.
It still seems likely that the actual location of the bug is either
in PostGIS or in the GIST index code, but without the ability to
reproduce the failure it's awfully hard to find it. Matthew, could
you try a bit harder to find a self-contained test case that produces
that error?
regards, tom lane
Sure, it might take me a while to find time but I'll keep it on my list.
Matt
Attachment
On Sun, Dec 09, 2012 at 02:09:21PM -0500, Matthew Gerber wrote: > In this situation, st_transform throws an internal_error, which my > function catches and returns NULL for. The error / crash is not caused by a > NULL argument; rather, it is caused by the final value in the attached > script's INSERT statement, which contains a lat/lon pair that is beyond > PostGIS's range. I'm not questioning whether this value is actually outside > the legal range, but I do not think such an input should cause the server > to crash completely. The server should not crash, no. However, the facts that PostGIS reported an internal error and the crash is responsive to your choice of geographic inputs increases the chance that the problem lies in PostGIS code, not PostgreSQL core code. > Here are the steps to reproduce the crash: > > 1) Create a new instance of a 9.2 server (Windows 64-bit), and a new > database (call it test) with the PostGIS extension. > > 2) Run the script: > > psql -U postgres -d test -f C:\server_crash.sql > > You should see the following: > > psql:C:/server_crash.sql:31: server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. > psql:C:/server_crash.sql:31: connection to server was lost > > 3) Check your log for the error. I tried this test procedure, but I could not reproduce the crash. PostgreSQL: one click installer postgresql-9.2.2-1-windows-x64.exe; PostGIS: v2.0.1 from Stack Builder; OS: Windows Server 2008 R2 Datacenter. I needed the workaround in the last comment of this[1] bug report to get a working installation. At that point, your test procedure completes without error. What specific versions are involved in your installation? Could you try again to get a minidump and stack trace? Connect to your test database with psql; run "SELECT pg_backend_pid();"; open Visual Studio; select Debug -> Attach to process...; select the postgres.exe process with matching ID. Run your test case; when the exception window pops up, select "Break". If the stack trace does not contain full symbol information, right click on some of the incomplete lines and select Load Symbols From -> Symbol Path; navigate to the location of postgres.pdb. You can select "Save Dump As..." from the Debug menu to create the minidump. Thanks, nm [1] http://trac.osgeo.org/postgis/ticket/1824