Thread: 60 GB of error log.

60 GB of error log.

From: Erwin Brandstetter
Date: 20 May 2010, 00:26:45

Aloha!

I encountered a serious problem involving pgAdmin3 1.10.3 today, but I
am not entirely sure what is to blame. Maybe one of you can make
something of it?

I gave an application developer access to one of our databases
(PostgreSQL 8.4.3 on Debian Lenny) so we could discuss the design of a
new application in light of the existing data. Naturally, I suggested
pgAdmin 1.10.3 for viewing data and structure.
He downloaded and installed the Mac edition (German version) and
connected via TCP/IP to the standard port 5432, with "SSL require". He
tried a couple of things; the last thing he did was experiment with the
"Graphical Query Builder". He mentioned that he stopped a few queries
that took too long. I saw matching error messages in the log, but
nothing unusual. Then he carelessly left his notebook running overnight
with pgAdmin open, or at least that's what he swears is all he did.

It appears that a glitch in the SSL handshake triggered a loop that led
to millions of error messages in the database log. The day after, I had
to deal with 60 GB (!) of log. I am not sure about the role of pgAdmin,
the GQB or the SSL protocol (including possible quirks of the server).
The fact is, it almost stalled our production database server by eating
up all available disk space. This is the most serious incident
involving pgAdmin I have experienced so far. (Yes, I need a strategy to
prevent that from happening in general; I have learned that lesson.)

I attached a snippet of the log. Maybe it tells one of you something?
The locale is German; I added some English translations.

Regards
Erwin

Attachment

demo.log
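
For readers facing a similarly huge log of repeated errors, the
following is a minimal sketch (not part of the original report) of how
runs of identical messages could be collapsed into counts, so that a
loop like the one described above shows up as a single line. The
timestamp pattern is an assumption about the log_line_prefix in use,
and the names collapse_repeats and summarize are hypothetical. The file
is streamed line by line, so memory use stays roughly constant even for
a very large file.

```python
import re
from itertools import groupby

# Assumption: each line starts with "YYYY-MM-DD HH:MM:SS TZ ".
# Adjust the pattern to match the server's actual log_line_prefix.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(?:\.\d+)? \S+ ")


def strip_timestamp(line):
    """Return the line without its trailing newline and leading timestamp."""
    return TIMESTAMP.sub("", line.rstrip("\n"))


def collapse_repeats(lines):
    """Yield (message, count) for each run of consecutive identical messages."""
    for message, run in groupby(lines, key=strip_timestamp):
        yield message, sum(1 for _ in run)


def summarize(path):
    """Print one line per run of repeated messages in the log file at path."""
    with open(path, encoding="utf-8", errors="replace") as log:
        for message, count in collapse_repeats(log):
            print(f"{count:>10}  {message}")
```

Calling summarize() with the path to a log file prints each distinct
consecutive message once, preceded by the number of times it repeated.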

Re: 60 GB of error log.

From: Dave Page
Date: 20 May 2010, 03:21:12

pgAdmin doesn't really handle any of the SSL stuff itself, except to
turn it on or off, so we may need to pass this upstream. I wonder
though, if you've been hit by a variant of this:
http://archives.postgresql.org/pgsql-hackers/2010-02/msg00198.php

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: 60 GB of error log.

From: Erwin Brandstetter
Date: 20 May 2010, 17:46:32

On 20.05.2010 05:21, dpage@pgadmin.org wrote:
> pgAdmin doesn't really handle any of the SSL stuff itself, except to
> turn it on or off, so we may need to pass this upstream. I wonder
> though, if you've been hit by a variant of this:
> http://archives.postgresql.org/pgsql-hackers/2010-02/msg00198.php
>

Thanks, Dave! I have dug through the thread, and it certainly looks to
be the root of the problem.
The question remains what triggered the endless loop. Judging from the
log, pgAdmin must have kept sending the same query over and over, even
after the "ssl handshake failure". Why did it not stop?
Or am I misinterpreting the sequence of events? Could it have been a
local loop between the postmaster and the SSL library on the server,
where each attempt at sending the error message triggered the next
error? That would be a serious server problem that is not itself
covered in the thread on pgsql-hackers ...

Regards
Erwin

Re: 60 GB of error log.

From: Dave Page
Date: 20 May 2010, 18:36:44

On Thu, May 20, 2010 at 1:46 PM, Erwin Brandstetter
<brandstetter@falter.at> wrote:
> Thanks, Dave! I have dug through the thread and it certainly looks to be the
> root of  the problem.
> The question remains what triggered the endless loop. Judging from the log,
> pgAdmin must have kept sending the same query over and over - after the "ssl
> handshake failure". Why did it not stop?
> Or am I misinterpreting the procedure? Could it have been a local loop
> between postmaster and the SSL library on the server? Each attempt on
> sending the error message triggered the next error? That would be a serious
> problem of the server that is not by itself covered in the thread on
> pgsql-hackers ...

I can't see any reason why pgAdmin would re-run the query unless your
cat was sitting on F5. I suspect SSL...


--
Dave Page