Thread: Crash in vacuum analyze

Crash in vacuum analyze

From
Robert L Mathews
Date:
I'm using PostgreSQL 7.1.2 on a Red Hat 7.1 box.

I have a database where the backend crashes when I do "vacuum analyze".
It does not happen when I do pg_dump or vacuum, and other databases on
the same box work fine with vacuum analyze. Here's the tail end of the
"vacuum verbose analyze" log:

NOTICE:  Analyzing...
NOTICE:  --Relation document--
NOTICE:  Pages 15: Changed 0, reaped 0, Empty 0, New 0; Tup 79: Vac 0,
Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 72, MaxLen 2028; Re-using:
Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE:  Index document_pkey: Pages 2; Tuples 79. CPU 0.00s/0.00u sec.
NOTICE:  Index document_unique_region_period: Pages 2; Tuples 79. CPU
0.00s/0.00u sec.
NOTICE:  --Relation pg_toast_28441--
NOTICE:  Pages 59: Changed 0, reaped 1, Empty 0, New 0; Tup 262: Vac 0,
Keep/VTL 0/0, Crash 0, UnUsed 2, MinLen 76, MaxLen 2034; Re-using:
Free/Avail. Space 4084/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.01u sec.
NOTICE:  Index pg_toast_28441_idx: Pages 2; Tuples 262: Deleted 0. CPU
0.00s/0.00u sec.
NOTICE:  Analyzing...
pqReadData() -- backend closed the channel unexpectedly.
    This probably means the backend terminated abnormally
    before or while processing the request.
connection to server was lost

It happens at the same point each time I try it. I haven't noticed any
particular problems using the database, but this (obviously) worries me.

I've already tried dropping the database, recreating it, and re-importing
it from the pg_dump; the same thing happens.

What else should I try?

--
Robert L Mathews, Tiger Technologies


Re: Crash in vacuum analyze

From
Tom Lane
Date:
Robert L Mathews <lists@tigertech.com> writes:
> I have a database where the backend crashes when I do "vacuum analyze".

What shows up in the postmaster log?  Is a core file produced, and if so
can you provide a stack trace from it?  What is the schema of the table
producing the problem ("document", apparently)?

            regards, tom lane

Re: Crash in vacuum analyze

From
Dave Cramer
Date:
There is a bug in the glibc library that causes this. I think there is
some documentation on the list about it.

Tom?

Dave
On Mon, 2001-09-03 at 17:55, Tom Lane wrote:
> Robert L Mathews <lists@tigertech.com> writes:
> > I have a database where the backend crashes when I do "vacuum analyze".
>
> What shows up in the postmaster log?  Is a core file produced, and if so
> can you provide a stack trace from it?  What is the schema of the table
> producing the problem ("document", apparently)?
>
>             regards, tom lane

Re: Crash in vacuum analyze

From
Tom Lane
Date:
Dave Cramer <Dave@micro-automation.net> writes:
> There is a bug in the glibc library that causes this.

Hmm ... he *could* be suffering from that strcoll() bug, but with no
info about his platform I'm hesitant to jump to that conclusion.

            regards, tom lane

Re: Crash in vacuum analyze

From
Robert L Mathews
Date:
At 9/3/01 4:11 PM, Tom Lane wrote:

>Hmm ... he *could* be suffering from that strcoll() bug, but with no
>info about his platform I'm hesitant to jump to that conclusion.

It was indeed the strcoll bug in glibc 2.2.2. The database in question
has some long strings that triggered it.

Upgrading to a later glibc fixed the problem. Thanks for the pointer (and
your time); I appreciate it.

--
Robert L Mathews, Tiger Technologies


Re: Crash in vacuum analyze

From
Sean Chittenden
Date:
> >Hmm ... he *could* be suffering from that strcoll() bug, but with no
> >info about his platform I'm hesitant to jump to that conclusion.
>
> It was indeed the strcoll bug in glibc 2.2.2. The database in question
> has some long strings that triggered it.
>
> Upgrading to a later glibc fixed the problem. Thanks for the pointer (and
> your time); I appreciate it.

        Can we test for this at configure time and spit out a warning
message to the user that they need to upgrade their version of glibc?
-sc

--
Sean Chittenden

Re: Crash in vacuum analyze

From
Tom Lane
Date:
Sean Chittenden <sean-pgsql-general@chittenden.org> writes:
>         Can we test for this at configure time and spit out a warning
> message to the user that they need to upgrade their version of glibc?

I think most of the people who are getting bitten have installed PG from
RPMs, so configure couldn't help them anyway.

Perhaps our future RPMs should have a dependency that requires glibc
version >= first fixed version.

            regards, tom lane

Re: Crash in vacuum analyze

From
Robert L Mathews
Date:
At 9/3/01 6:00 PM, sean-pgsql-general@chittenden.org wrote:

>> Upgrading to a later glibc fixed the problem. Thanks for the pointer (and
>> your time); I appreciate it.
>
>        Can we test for this at configure time and spit out a warning
>message to the user that they need to upgrade their version of glibc?
>-sc

I was using RPMs, so that wouldn't have helped in my case (unless the RPM
script also had such a test).

--
Robert L Mathews, Tiger Technologies
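A runtime check of the kind discussed above could be sketched as follows. This is only an illustration under stated assumptions: the thread says the bug is in strcoll() and that long strings triggered it, but it does not give an exact reproducer, so the string length and contents used here are guesses, and the name strcoll_survives is hypothetical. The comparison runs in a child process so that a crash in the C library cannot take down the caller.

```python
import subprocess
import sys

# Code run in the child process. It selects the collation locale from the
# environment (falling back to the default if that locale is unavailable)
# and then compares two long strings with strcoll().
CHILD = """
import locale, sys
try:
    locale.setlocale(locale.LC_COLLATE, "")
except locale.Error:
    pass
n = int(sys.argv[1])
locale.strcoll("a" * n, "a" * n + "b")
"""


def strcoll_survives(length=10000):
    """Return True if a child process can call strcoll() on two strings of
    roughly `length` characters and exit normally.

    ASSUMPTION: the length and contents are illustrative; the thread does
    not say which inputs trigger the glibc bug.
    """
    result = subprocess.run(
        [sys.executable, "-c", CHILD, str(length)],
        timeout=60,
    )
    # On POSIX a child killed by a signal (e.g. SIGSEGV) reports a negative
    # return code, so anything other than 0 counts as a failure here.
    return result.returncode == 0


if __name__ == "__main__":
    if strcoll_survives():
        print("strcoll() handled the test strings")
    else:
        print("strcoll() failed in the child process; check the glibc version")
```

A passing result does not prove the library is free of the bug; it only shows that this particular input did not crash.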


Re: Crash in vacuum analyze

From
Sean Chittenden
Date:
> >         Can we test for this at configure time and spit out a warning
> > message to the user that they need to upgrade their version of glibc?
>
> I think most of the people who are getting bitten have installed PG from
> RPMs, so configure couldn't help them anyway.

::grrr::  I have nothing nice to say about RPMs and the headaches they
have caused me in the past.  Is it too late to lobby Linux and/or Red Hat
and ask them for a new package format similar to the ports tree?
::grin:: In any case, Tom, you're right again as always.

> Perhaps our future RPMs should have a dependency that requires glibc
> version >= first fixed version.

    Far be it from me to disagree...  was this bug only in Linux's
glibc?

    -sc

--
Sean Chittenden

Re: Crash in vacuum analyze

From
"Jeff Boes"
Date:
In article <999558317.8648.1.camel@inspiron.cramers>, "Dave Cramer"
<Dave@micro-automation.net> wrote:

> There is a bug in the glibc library that causes this. I think there is
> some documentation on the list about it.

Anybody have a pointer to more info about this?  How do I determine if
this affects my system? (I'm having problems similar to this with VACUUM
ANALYZE on one particular, long-row table.)

--
Jeff Boes                                             vox 616.226.9550
Database Engineer                                     fax 616.349.9076
Nexcerpt, Inc.                                      jboes@nexcerpt.com

Re: Crash in vacuum analyze

From
Robert L Mathews
Date:
At 9/6/01 6:34 PM, Jeff Boes <jboes@nexcerpt.com> wrote:

>In article <999558317.8648.1.camel@inspiron.cramers>, "Dave Cramer"
><Dave@micro-automation.net> wrote:
>
>> There is a bug in the glibc library that causes this. I think there is
>> some documentation on the list about it.
>
>Anybody have a pointer to more info about this?  How do I determine if
>this affects my system? (I'm having problems similar to this with VACUUM
>ANALYZE on one particular, long-row table.)

That sounds like it's probably it (especially if you can do a normal
vacuum with no trouble). If you're using glibc version 2.2.2 or earlier,
your machine is vulnerable.

The solution is to upgrade to glibc 2.2.3 or later.

--
Robert L Mathews, Tiger Technologies


Re: Crash in vacuum analyze

From
Tom Lane
Date:
"Jeff Boes" <jboes@nexcerpt.com> writes:
> In article <999558317.8648.1.camel@inspiron.cramers>, "Dave Cramer"
> <Dave@micro-automation.net> wrote:
>> There is a bug in the glibc library that causes this. I think there is
>> some documentation on the list about it.

> Anybody have a pointer to more info about this?

See the thread starting at

http://fts.postgresql.org/db/mw/msg.html?mid=1021209

Bottom line is that strcoll() is broken in glibc versions before 2.2.3.
If you are running a Postgres installation with locale support compiled
in, then you are vulnerable to this bug.

This may or may not explain your particular problem, of course, but
it's a good thing to check before digging further.

            regards, tom lane
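
For the question of how to tell whether a given system is affected, a version comparison against the threshold stated in this thread (strcoll() broken before glibc 2.2.3) can be sketched as below. The helper names parse_version and is_affected are hypothetical, and the sketch assumes that Python's platform.libc_ver() reports the glibc version on the machine in question. As noted above, an affected glibc only matters if the Postgres installation was built with locale support.

```python
import platform
import re

# Per the thread: strcoll() is broken in glibc versions before 2.2.3.
FIRST_FIXED = (2, 2, 3)


def parse_version(text):
    """Turn a version string such as '2.2.2' or '2.31' into a 3-tuple of ints."""
    parts = [int(p) for p in re.findall(r"\d+", text)[:3]]
    # Pad short versions like '2.31' to (2, 31, 0) so tuples compare cleanly.
    return tuple(parts + [0] * (3 - len(parts)))


def is_affected(version_text):
    """Return True if the given glibc version is older than the first fixed one."""
    return parse_version(version_text) < FIRST_FIXED


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        status = "affected" if is_affected(version) else "not affected"
        print(f"glibc {version}: {status} by the strcoll() bug")
    else:
        print("could not determine a glibc version on this system")
```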