Thread: Re: [GENERAL] 8.2.4 signal 11 with large transaction

Re: [GENERAL] 8.2.4 signal 11 with large transaction

From: Tom Lane
Bill Moran <wmoran@collaborativefusion.com> writes:
> Oddly, the query succeeds if it's fed into psql.

> I'm now full of mystery and wonder.  It would appear as if the
> underlying problem has something to do with PHP, but why should this
> cause a backend process to crash?

Ah, I see it.  Your PHP script is sending all 30000 INSERT commands
to the backend *in a single PQexec*, i.e., one 37MB command string.
psql won't do that; it splits the input at semicolons.

Unsurprisingly, this runs the backend out of memory.  (It's not the
command string that's the problem, so much as the 30000 parse and plan
trees...)
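
To illustrate the difference, here is a minimal Python sketch of splitting a
script at semicolons so that each statement can be sent as its own command.
It is not psql's actual lexer: the function name split_statements is
hypothetical, and the sketch only understands single-quoted literals (no
comments, dollar-quoting, or backslash escapes).

```python
def split_statements(script: str) -> list[str]:
    """Split SQL text at semicolons that fall outside single-quoted literals.

    Simplification (assumption): comments, dollar-quoting and backslash
    escapes are ignored; a real client lexer has to handle all of them.
    """
    statements = []
    current = []
    in_quote = False
    for ch in script:
        if ch == "'":
            # A doubled quote ('') inside a literal toggles twice,
            # so the state ends up unchanged.
            in_quote = not in_quote
        if ch == ";" and not in_quote:
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


script = "INSERT INTO t VALUES (1, 'a;b'); INSERT INTO t VALUES (2, 'it''s');"
for stmt in split_statements(script):
    # Each statement would be sent to the server on its own, so the backend
    # only ever holds one parse and plan tree at a time.
    print(stmt)
```

Sending the statements one at a time keeps the backend's memory use bounded
by the largest single statement rather than by the whole script.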

Unfortunately, in trying to prepare the error message, it tries to
attach the command text as the STATEMENT field of the log message.
All 37MB worth.  And of course *that* gets an out-of-memory error.
Presto, infinite recursion, broken only by stack overflow (= SIGSEGV).

It looks like 8.1 and older are also vulnerable to this; it's just that
they don't try to log error statement strings at the default logging
level, whereas 8.2 does.  If you cranked up log_min_error_statement
I think they'd fail too.

I guess what we need to do is hack the emergency-recovery path for
error-during-error-processing such that it will prevent trying to print
a very long debug_query_string.  Maybe we should just not try to print
the command at all in this case, or maybe there's some intermediate
possibility like only printing the first 1K or so.  Thoughts?
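
The "first 1K or so" option could look roughly like the Python sketch below.
It only illustrates the idea and is not backend code; the name
statement_for_log and the 1024-character limit are assumptions made for the
example.

```python
MAX_LOGGED_STATEMENT = 1024  # hypothetical limit, standing in for "1K or so"


def statement_for_log(query: str, limit: int = MAX_LOGGED_STATEMENT) -> str:
    """Return the query text to attach to a log entry, clipped to `limit`."""
    if len(query) <= limit:
        return query
    omitted = len(query) - limit
    # Keep the head of the statement so the offending query can still be
    # identified, and say how much was cut.
    return f"{query[:limit]}... ({omitted} more characters omitted)"


big_command = "INSERT INTO t VALUES (1);" * 30000
print(len(big_command))
print(len(statement_for_log(big_command)))
```

A real implementation working on byte strings would also have to avoid
cutting a multibyte character in half.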

            regards, tom lane

Re: [GENERAL] 8.2.4 signal 11 with large transaction

From: "Chris Hoover"
On 7/20/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I guess what we need to do is hack the emergency-recovery path for
> error-during-error-processing such that it will prevent trying to print
> a very long debug_query_string.  Maybe we should just not try to print
> the command at all in this case, or maybe there's some intermediate
> possibility like only printing the first 1K or so.  Thoughts?
>
>                         regards, tom lane


I know my 2 cents are not worth that much, but as a DBA, I would really like for you to print at least some of the string causing the abend.  This would greatly assist in the tracing of the offending query.

Chris


Re: [GENERAL] 8.2.4 signal 11 with large transaction

From: "Sibte Abbas"
On 7/20/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I guess what we need to do is hack the emergency-recovery path for
> error-during-error-processing such that it will prevent trying to print
> a very long debug_query_string.  Maybe we should just not try to print
> the command at all in this case, or maybe there's some intermediate
> possibility like only printing the first 1K or so.  Thoughts?
>
>                         regards, tom lane
>

I think printing the first 1K would make more sense.

If I understand you correctly, the code path which you are referring
to is the send_message_to_server_log() function in elog.c?

thanks,
--
Sibte Abbas
EnterpriseDB http://www.enterprisedb.com


Re: [GENERAL] 8.2.4 signal 11 with large transaction

From: Tom Lane
"Sibte Abbas" <sibtay@gmail.com> writes:
> I think printing the first 1K would make more sense.

> If I understand you correctly, the code path which you are referring
> to is the send_message_to_server_log() function in elog.c?

No, the place that has to change is where errstart() detects that we're
recursing.  We could possibly have it first try to make a shorter string
and only give up entirely if recursion happens again, but given that
this is such a corner case I don't think it's worth the complexity and
risk of further bugs.  I've made it just drop the statement at the same
time that it decides to give up on printing other context (which can
also be a source of out-of-memory problems btw).
http://archives.postgresql.org/pgsql-committers/2007-07/msg00215.php
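
For readers who want the shape of the fix without reading the C code, here
is a toy Python model of it. It is not the committed change: the class
ErrorReporter, its fields, and the use of MemoryError to stand in for an
out-of-memory failure are all assumptions made for illustration. The point
it shows is that once the reporter notices it is recursing, it stops trying
to attach the statement text.

```python
class ErrorReporter:
    """Toy model of an error reporter guarded against recursion."""

    def __init__(self, statement: str, fail_on_statement: bool = False):
        self.statement = statement
        # Simulates "copying the statement text runs out of memory".
        self.fail_on_statement = fail_on_statement
        self.depth = 0
        self.log = []

    def _attach_statement(self) -> str:
        if self.fail_on_statement:
            raise MemoryError("out of memory")
        return self.statement

    def report(self, message: str) -> None:
        self.depth += 1
        try:
            recursing = self.depth > 1
            fields = [f"ERROR:  {message}"]
            if not recursing:
                # Only the outermost report tries to attach the statement.
                fields.append(f"STATEMENT:  {self._attach_statement()}")
            self.log.append("\n".join(fields))
        except MemoryError as exc:
            # Error during error processing: report again one level deeper,
            # where the statement is dropped, so the recursion terminates.
            self.report(str(exc))
        finally:
            self.depth -= 1


reporter = ErrorReporter("INSERT INTO t VALUES (1); ...", fail_on_statement=True)
reporter.report("out of memory")
print(reporter.log)
```

Without the `recursing` check, the nested report would try to attach the
statement again, fail again, and recurse until the stack is exhausted, which
is the failure mode described earlier in this thread.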

            regards, tom lane

Re: [GENERAL] 8.2.4 signal 11 with large transaction

From: "Sibte Abbas"
On 7/23/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> No, the place that has to change is where errstart() detects that we're
> recursing.  We could possibly have it first try to make a shorter string
> and only give up entirely if recursion happens again, but given that
> this is such a corner case I don't think it's worth the complexity and
> risk of further bugs.  I've made it just drop the statement at the same
> time that it decides to give up on printing other context (which can
> also be a source of out-of-memory problems btw).
> http://archives.postgresql.org/pgsql-committers/2007-07/msg00215.php
>

Makes sense.

regards,
--
Sibte Abbas
EnterpriseDB http://www.enterprisedb.com