Thread: segfault of autovacuum process during restore - coredumps included

segfault of autovacuum process during restore - coredumps included

From
Frank van Vugt
Date:
L.S.

Since I started to use v8.1 I've been seeing incidental segfaults of the=20
autovacuum process, probably when it is kicking in at some particular point=
=20
during the pg_restore of a database:

<-2005-11-28 09:39:12 CET>LOG:  autovacuum process (PID 5075) was terminate=
d=20
by signal 11
<-2005-11-28 09:39:12 CET>LOG:  terminating any other active server process=
es

and

<-2005-11-28 10:09:21 CET>LOG:  autovacuum process (PID 5230) was terminate=
d=20
by signal 11
<-2005-11-28 10:09:21 CET>LOG:  terminating any other active server process=
es



Triggering it twice this morning annoyed me enough to recompile with debug=
=20
info/asserts. During a number of (20-30) additional 'tries'  and I was able=
=20
to trigger the problem twice, resulting  the following coredumps:

#0  0x082535fc in CopySnapshot (snapshot=3D0x0) at tqual.c:1301
1301            newsnap =3D (Snapshot) palloc(sizeof(SnapshotData) +
(gdb) where
#0  0x082535fc in CopySnapshot (snapshot=3D0x0) at tqual.c:1301
#1  0x0814673b in fmgr_sql (fcinfo=3D0xbfe9d740) at functions.c:319
#2  0x081400bc in ExecMakeFunctionResult (fcache=3D0x8517ab8,=20
econtext=3D0x8517f10, isNull=3D0xbfe9d9bb "=EF=BF=BD231=3D\b", isDone=3D0x0=
) at=20
execQual.c:1096
#3  0x081426d6 in ExecEvalExprSwitchContext (expression=3D0x8519b04,=20
econtext=3D0x0, isNull=3D0xbfe9d9bb "=EF=BF=BD231=3D\b", isDone=3D0x0) at e=
xecQual.c:2865
#4  0x08189e53 in evaluate_expr (expr=3D0x8517ab8, result_type=3D23) at=20
clauses.c:2646
#5  0x0818b8e9 in simplify_function (funcid=3D163843, result_type=3D23,=20
args=3D0x85023c0, allow_inline=3D1 '\001', context=3D0xbfe9dbd0) at clauses=
.c:2260
#6  0x0818bdea in eval_const_expressions_mutator (node=3D0x8502144,=20
context=3D0xbfe9dbd0) at clauses.c:1305
#7  0x0818a5bd in expression_tree_mutator (node=3D0x85015a0, mutator=3D0x81=
8bc10=20
<eval_const_expressions_mutator>, context=3D0xbfe9dbd0) at clauses.c:3473
#8  0x0818be4d in eval_const_expressions_mutator (node=3D0x8501694,=20
context=3D0xbfe9dbd0) at clauses.c:1335
#9  0x0818c431 in eval_const_expressions_mutator (node=3D0x8502304,=20
context=3D0xbfe9dbd0) at clauses.c:2030
#10 0x0818ca35 in eval_const_expressions (node=3D0x85016c0) at clauses.c:12=
11
#11 0x0822fd63 in RelationGetIndexPredicate (relation=3D0x443361b8) at=20
relcache.c:2790
#12 0x080cb49f in BuildIndexInfo (index=3D0x443361b8) at index.c:900
#13 0x080febd6 in analyze_rel (relid=3D164956, vacstmt=3D0x44217764) at=20
analyze.c:257
#14 0x0813728b in vacuum (vacstmt=3D0x44217764, relids=3D0x44219fb8) at=20
vacuum.c:476
#15 0x0819350f in autovacuum_do_vac_analyze (relids=3D0x4421b394, dovacuum=
=3D0=20
'\0', doanalyze=3D1 '\001', freeze=3D0 '\0') at autovacuum.c:900
#16 0x08193ead in AutoVacMain (argc=3D0, argv=3D0x0) at autovacuum.c:674
#17 0x081941f6 in autovac_start () at autovacuum.c:170
#18 0x08199f91 in ServerLoop () at postmaster.c:1268
#19 0x0819b122 in PostmasterMain (argc=3D3, argv=3D0x833b8b8) at postmaster=
.c:943
#20 0x0815bece in main (argc=3D3, argv=3D0x833b8b8) at main.c:256

and

#0  0x082535fc in CopySnapshot (snapshot=3D0x0) at tqual.c:1301
1301            newsnap =3D (Snapshot) palloc(sizeof(SnapshotData) +
(gdb) bt
#0  0x082535fc in CopySnapshot (snapshot=3D0x0) at tqual.c:1301
#1  0x0814673b in fmgr_sql (fcinfo=3D0xbfd97090) at functions.c:319
#2  0x081400bc in ExecMakeFunctionResult (fcache=3D0x8567a50,=20
econtext=3D0x8567ea8, isNull=3D0xbfd9730b "", isDone=3D0x0) at execQual.c:1=
096
#3  0x081426d6 in ExecEvalExprSwitchContext (expression=3D0x8569a9c,=20
econtext=3D0x0, isNull=3D0xbfd9730b "", isDone=3D0x0) at execQual.c:2865
#4  0x08189e53 in evaluate_expr (expr=3D0x8567a50, result_type=3D23) at=20
clauses.c:2646
#5  0x0818b8e9 in simplify_function (funcid=3D226356, result_type=3D23,=20
args=3D0x8552278, allow_inline=3D1 '\001', context=3D0xbfd97520) at clauses=
.c:2260
#6  0x0818bdea in eval_const_expressions_mutator (node=3D0x8551ffc,=20
context=3D0xbfd97520) at clauses.c:1305
#7  0x0818a5bd in expression_tree_mutator (node=3D0x853d450, mutator=3D0x81=
8bc10=20
<eval_const_expressions_mutator>, context=3D0xbfd97520) at clauses.c:3473
#8  0x0818be4d in eval_const_expressions_mutator (node=3D0x853d544,=20
context=3D0xbfd97520) at clauses.c:1335
#9  0x0818c431 in eval_const_expressions_mutator (node=3D0x85521bc,=20
context=3D0xbfd97520) at clauses.c:2030
#10 0x0818ca35 in eval_const_expressions (node=3D0x853d570) at clauses.c:12=
11
#11 0x0822fd63 in RelationGetIndexPredicate (relation=3D0x44377818) at=20
relcache.c:2790
#12 0x080cb49f in BuildIndexInfo (index=3D0x44377818) at index.c:900
#13 0x080febd6 in analyze_rel (relid=3D227469, vacstmt=3D0x44258764) at=20
analyze.c:257
#14 0x0813728b in vacuum (vacstmt=3D0x44258764, relids=3D0x4425afb8) at=20
vacuum.c:476
#15 0x0819350f in autovacuum_do_vac_analyze (relids=3D0x4425c394, dovacuum=
=3D0=20
'\0', doanalyze=3D1 '\001', freeze=3D0 '\0') at autovacuum.c:900
#16 0x08193ead in AutoVacMain (argc=3D0, argv=3D0x0) at autovacuum.c:674
#17 0x081941f6 in autovac_start () at autovacuum.c:170
#18 0x08199f91 in ServerLoop () at postmaster.c:1268
#19 0x0819b122 in PostmasterMain (argc=3D3, argv=3D0x833b8b8) at postmaster=
.c:943
#20 0x0815bece in main (argc=3D3, argv=3D0x833b8b8) at main.c:256


Both dumps are still available for additional info on variable-values, etc.



db=3D# select version();
                                version
------------------------------------------------------------------------
 PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.3





--=20
Best,




Frank.

Re: segfault of autovacuum process during restore - coredumps included

From
Alvaro Herrera
Date:
Frank van Vugt wrote:
> L.S.
>
> Since I started to use v8.1 I've been seeing incidental segfaults of the
> autovacuum process, probably when it is kicking in at some particular point
> during the pg_restore of a database:

The problem appears to be that autovacuum is forgetting to set a
snapshot in cases where it's needed.  My bug -- should be easy to fix.
I'll apply a patch shortly.

Thanks for the report!

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: segfault of autovacuum process during restore - coredumps included

From
Alvaro Herrera
Date:
Frank van Vugt wrote:

> Since I started to use v8.1 I've been seeing incidental segfaults of the
> autovacuum process, probably when it is kicking in at some particular point
> during the pg_restore of a database:

Hum, I'm unable to reproduce the problem here; could you give me the
code for the functions in the functional indexes for tables 164956 or
227469?  They must be non-trivial SQL functions AFAICT (I'm rather
unable to figure out their shape based only on the backtrace.)

The attached patch should correct the problem, but I'd like to make sure
it does ...

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachment

Re: segfault of autovacuum process during restore - coredumps included

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> Frank van Vugt wrote:
>
> > Since I started to use v8.1 I've been seeing incidental segfaults of the
> > autovacuum process, probably when it is kicking in at some particular point
> > during the pg_restore of a database:

[...]

> The attached patch should correct the problem, but I'd like to make sure
> it does ...

I managed to make it fail, and the patch does indeed correct the
problem.  I'm applying to both HEAD and the 8.1 branch.  If you could
confirm that it fixes the problem for you, that'd be good.

Thanks,

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: segfault of autovacuum process during restore - coredumps included

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> The attached patch should correct the problem, but I'd like to make sure
> it does ...

Rather than that, I'd suggest just setting ActiveSnapshot
unconditionally after each of the StartTransactionCommand calls in
autovacuum.c, ie make the code look just like vacuum.c:

    /* Begin a transaction for vacuuming this relation */
    StartTransactionCommand();
    /* functions in indexes may want a snapshot set */
    ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());

This seems more future-proof.  The patch as proposed is assuming a whole
lot about where snapshots might or might not get used.

            regards, tom lane

Re: segfault of autovacuum process during restore - coredumps included

From
Frank van Vugt
Date:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > The attached patch should correct the problem, but I'd like to make sure
> > it does ...

> Rather than that, I'd suggest just setting ActiveSnapshot
> unconditionally after each of the StartTransactionCommand calls in
> autovacuum.c, ie make the code look just like vacuum.c:
>
>     /* Begin a transaction for vacuuming this relation */
>     StartTransactionCommand();
>     /* functions in indexes may want a snapshot set */
>     ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
>
> This seems more future-proof.  The patch as proposed is assuming a whole
> lot about where snapshots might or might not get used.

Will try the patch tonight.

Tom, is your patch meant for the exact same location? Also, don't we need a
'CommitTransactionCommand()' as well?





--
Best,




Frank.

Re: segfault of autovacuum process during restore - coredumps included

From
Tom Lane
Date:
Frank van Vugt <ftm.van.vugt@foxi.nl> writes:
> Tom, is your patch meant for the exact same location?

No.  There are two StartTransactionCommand calls in autovacuum.c, and
what I'm suggesting is to add the ActiveSnapshot assignment after each
one.

            regards, tom lane

Re: segfault of autovacuum process during restore - coredumps included

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > The attached patch should correct the problem, but I'd like to make sure
> > it does ...
>
> Rather than that, I'd suggest just setting ActiveSnapshot
> unconditionally after each of the StartTransactionCommand calls in
> autovacuum.c, ie make the code look just like vacuum.c:
>
>     /* Begin a transaction for vacuuming this relation */
>     StartTransactionCommand();
>     /* functions in indexes may want a snapshot set */
>     ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());

Done.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: segfault of autovacuum process during restore - coredumps included

From
Frank van Vugt
Date:
Alvaro,

> Hum, I'm unable to reproduce the problem here; could you give me the
> code for the functions in the functional indexes for tables 164956 or
> 227469?  They must be non-trivial SQL functions AFAICT (I'm rather
> unable to figure out their shape based only on the backtrace.)

Obviously the oid's have changed ;), but the relid in frame 13 of the latest
backtrace:

#13 0x080febd6 in analyze_rel (relid=268480, vacstmt=0x4422f4a0) at
analyze.c:257

is referring to a table 'purchaseorder_line' which has one functional index
defined like this (taken from '\d' output):

    "purchaseorder_line_idx01" btree (salesorder_line_id)
    WHERE status_id <> pol_stat('POL_CANCELLED'::character varying)
        AND status_id <> pol_stat('POL_HANDLED'::character varying)

The pol_stat() function is a helper-function used to refer to statusses by
abbreviation instead of id and is defined as:

    CREATE OR REPLACE FUNCTION pol_stat(varchar)
        RETURNS int
        LANGUAGE 'sql'
        IMMUTABLE
        STRICT
        SECURITY INVOKER
        AS 'SELECT id FROM purchaseorder_line_status WHERE abbreviation = $1';


If you were interested in some other relid, just let me know.



--
Best,




Frank.

Re: segfault of autovacuum process during restore - coredumps included

From
Frank van Vugt
Date:
<TL>
> No.  There are two StartTransactionCommand calls in autovacuum.c, and
> what I'm suggesting is to add the ActiveSnapshot assignment after each
> one.

<AH>
> Done.

I've changed autovacuum.c per this diff, I 'hope' I misinterpreted what needed
to be done (see below):

======================================================
-- autovacuum.c_orig   2005-11-28 16:34:49.000000000 +0100
+++ autovacuum.c        2005-11-28 16:37:11.000000000 +0100
@@ -494,6 +494,9 @@
        /* Start a transaction so our commands have one to play into. */
        StartTransactionCommand();

+       /* Begin a transaction for vacuuming this database */
+       ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
+
        dbRel = heap_open(DatabaseRelationId, AccessShareLock);

        /* Must use a table scan, since there's no syscache for pg_database */
@@ -555,6 +558,9 @@
        /* Start a transaction so our commands have one to play into. */
        StartTransactionCommand();

+       /* Begin a transaction for vacuuming this relation */
+       ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
+
        /*
         * StartTransactionCommand and CommitTransactionCommand will
automatically
         * switch to other contexts.  We need this one to keep the list of
======================================================

Both the first and second attempt to restore my development database
failed..... ;(


Here's a new backtrace:


#0  0x0825361c in CopySnapshot (snapshot=0x0) at tqual.c:1301
1301            newsnap = (Snapshot) palloc(sizeof(SnapshotData) +
(gdb) bt
#0  0x0825361c in CopySnapshot (snapshot=0x0) at tqual.c:1301
#1  0x0814673b in fmgr_sql (fcinfo=0xbfa79e20) at functions.c:319
#2  0x081400bc in ExecMakeFunctionResult (fcache=0x84e3780,
econtext=0x84e3bd8, isNull=0xbfa7a09b "<\b", isDone=0x0) at execQual.c:1096
#3  0x081426d6 in ExecEvalExprSwitchContext (expression=0x850dfdc,
econtext=0x0, isNull=0xbfa7a09b "<\b", isDone=0x0) at execQual.c:2865
#4  0x08189e53 in evaluate_expr (expr=0x84e3780, result_type=23) at
clauses.c:2646
#5  0x0818b8e9 in simplify_function (funcid=267367, result_type=23,
args=0x84ed6d0, allow_inline=1 '\001', context=0xbfa7a2b0) at clauses.c:2260
#6  0x0818bdea in eval_const_expressions_mutator (node=0x84ed454,
context=0xbfa7a2b0) at clauses.c:1305
#7  0x0818a5bd in expression_tree_mutator (node=0x84e35a0, mutator=0x818bc10
<eval_const_expressions_mutator>, context=0xbfa7a2b0) at clauses.c:3473
#8  0x0818be4d in eval_const_expressions_mutator (node=0x84e3694,
context=0xbfa7a2b0) at clauses.c:1335
#9  0x0818c431 in eval_const_expressions_mutator (node=0x84ed614,
context=0xbfa7a2b0) at clauses.c:2030
#10 0x0818ca35 in eval_const_expressions (node=0x84e36c0) at clauses.c:1211
#11 0x0822fd83 in RelationGetIndexPredicate (relation=0x4433f40c) at
relcache.c:2790
#12 0x080cb49f in BuildIndexInfo (index=0x4433f40c) at index.c:900
#13 0x080febd6 in analyze_rel (relid=268480, vacstmt=0x4422f4a0) at
analyze.c:257
#14 0x0813728b in vacuum (vacstmt=0x4422f4a0, relids=0x44228dcc) at
vacuum.c:476
#15 0x0819350f in autovacuum_do_vac_analyze (relids=0x4422e528, dovacuum=0
'\0', doanalyze=1 '\001', freeze=0 '\0') at autovacuum.c:906
#16 0x08193ecd in AutoVacMain (argc=0, argv=0x0) at autovacuum.c:680
#17 0x08194216 in autovac_start () at autovacuum.c:170
#18 0x08199fb1 in ServerLoop () at postmaster.c:1268
#19 0x0819b142 in PostmasterMain (argc=3, argv=0x833b8a0) at postmaster.c:943
#20 0x0815bece in main (argc=3, argv=0x833b8a0) at main.c:256








--
Best,




Frank.

Re: segfault of autovacuum process during restore - coredumps included

From
Alvaro Herrera
Date:
Frank van Vugt wrote:
> <TL>
> > No.  There are two StartTransactionCommand calls in autovacuum.c, and
> > what I'm suggesting is to add the ActiveSnapshot assignment after each
> > one.
>
> I've changed autovacuum.c per this diff, I 'hope' I misinterpreted what needed
> to be done (see below):

No, that diff is exactly what I applied.  But I think you must have done
something else wrong, because while I can reproduce the crash easily on
the unpatched tree with your example, it certainly does not crash on CVS
HEAD.

Test case attached for the curious ...

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachment

Re: segfault of autovacuum process during restore - coredumps included

From
Frank van Vugt
Date:
> > I've changed autovacuum.c per this diff, I 'hope' I misinterpreted what
> > needed to be done (see below):
>
> No, that diff is exactly what I applied.  But I think you must have done
> something else wrong

That's a bit harsh ;)

> , because while I can reproduce the crash easily on
> the unpatched tree with your example, it certainly does not crash on CVS
> HEAD.

Or HEAD contains something that either fixes or masks what's going on with my
v8.1 install........

FYI, it's a plain v8.1 with the given patch on autovacuum.c and an earlier
patch from Tom for the distinct bug which only touches ExecMain.c and
ExecUtils.c

> Test case attached for the curious ...

When I run that inside a transaction and wait for the autovacuum to kick in,
I'm not seeing any problem either.

Still, when restoring the larger dump, I immediately saw the same segfault. In
order to be sure I didn't forget it earlier, I did a make clean / make
install and tried again with the same result.

Given the original incidental nature of the problem combined with the fact
that I now get four coredumps out of four restores, I'd have to say the patch
is doing something alright ;).


The latest coredump refers to relid 284864 which again is the
purchaseorder_line table:

#0  0x0825361c in CopySnapshot (snapshot=0x0) at tqual.c:1301
1301            newsnap = (Snapshot) palloc(sizeof(SnapshotData) +
(gdb) bt
#0  0x0825361c in CopySnapshot (snapshot=0x0) at tqual.c:1301
#1  0x0814673b in fmgr_sql (fcinfo=0xbf9b2180) at functions.c:319
#2  0x081400bc in ExecMakeFunctionResult (fcache=0x847a598,
econtext=0x847a9f0, isNull=0xbf9b23fb "HrA\b", isDone=0x0) at execQual.c:1096
#3  0x081426d6 in ExecEvalExprSwitchContext (expression=0x847bcb4,
econtext=0x0, isNull=0xbf9b23fb "HrA\b", isDone=0x0) at execQual.c:2865
#4  0x08189e53 in evaluate_expr (expr=0x847a598, result_type=23) at
clauses.c:2646
#5  0x0818b8e9 in simplify_function (funcid=283751, result_type=23,
args=0x84685c0, allow_inline=1 '\001', context=0xbf9b2610) at clauses.c:2260
#6  0x0818bdea in eval_const_expressions_mutator (node=0x846a29c,
context=0xbf9b2610) at clauses.c:1305
#7  0x0818a5bd in expression_tree_mutator (node=0x8468be8, mutator=0x818bc10
<eval_const_expressions_mutator>, context=0xbf9b2610) at clauses.c:3473
#8  0x0818be4d in eval_const_expressions_mutator (node=0x846b128,
context=0xbf9b2610) at clauses.c:1335
#9  0x0818c431 in eval_const_expressions_mutator (node=0x84686f0,
context=0xbf9b2610) at clauses.c:2030
#10 0x0818ca35 in eval_const_expressions (node=0x8468c04) at clauses.c:1211
#11 0x0822fd83 in RelationGetIndexPredicate (relation=0x4423f4e8) at
relcache.c:2790
#12 0x080cb49f in BuildIndexInfo (index=0x4423f4e8) at index.c:900
#13 0x080febd6 in analyze_rel (relid=284864, vacstmt=0x44246058) at
analyze.c:257
#14 0x0813728b in vacuum (vacstmt=0x44246058, relids=0x44130d4c) at
vacuum.c:476
#15 0x0819350f in autovacuum_do_vac_analyze (relids=0x44245fd0, dovacuum=0
'\0', doanalyze=1 '\001', freeze=0 '\0') at autovacuum.c:906
#16 0x08193ecd in AutoVacMain (argc=0, argv=0x0) at autovacuum.c:680
#17 0x08194216 in autovac_start () at autovacuum.c:170
#18 0x08199fb1 in ServerLoop () at postmaster.c:1268
#19 0x0819b142 in PostmasterMain (argc=3, argv=0x833b8a0) at postmaster.c:943
#20 0x0815bece in main (argc=3, argv=0x833b8a0) at main.c:256


FYI, this table also has a non-standard deferred trigger defined on it:

(from '\d')
"RI_ConstraintTrigger_253215" AFTER INSERT OR DELETE OR UPDATE ON
purchaseorder_line DEFERRABLE INITIALLY DEFERRED FOR EACH ROW EXECUTE
PROCEDURE tr_purchaseorder_line_def()

Which is originally defined by:

CREATE OR REPLACE FUNCTION tr_purchaseorder_line_def()
    RETURNS trigger
    LANGUAGE 'plpgsql'
    VOLATILE
    STRICT
    SECURITY INVOKER
    AS '    DECLARE
        BEGIN
            <lots of code>
            RETURN NULL;
        END;';


CREATE CONSTRAINT TRIGGER purchaseorder_line_def AFTER INSERT OR UPDATE OR
DELETE ON purchaseorder_line DEFERRABLE INITIALLY DEFERRED FOR EACH ROW
EXECUTE PROCEDURE tr_purchaseorder_line_def();






--
Best,




Frank.

Re: segfault of autovacuum process during restore - coredumps included

From
Tom Lane
Date:
Frank van Vugt <ftm.van.vugt@foxi.nl> writes:
> FYI, it's a plain v8.1 with the given patch on autovacuum.c and an earlier
> patch from Tom for the distinct bug which only touches ExecMain.c and
> ExecUtils.c

Hm, I'm wondering if the toast-vs-index bug could be relevant.  Could
you try 8.1 branch tip?  Or at least apply this patch:

http://archives.postgresql.org/pgsql-committers/2005-11/msg00439.php

            regards, tom lane

Re: segfault of autovacuum process during restore - coredumps included

From
Frank van Vugt
Date:
> Hm, I'm wondering if the toast-vs-index bug could be relevant.  Could
> you try 8.1 branch tip?  Or at least apply this patch:
>
> http://archives.postgresql.org/pgsql-committers/2005-11/msg00439.php

Hmm, the patch mentioned for ExecMain.c won't apply to my base version, so I
went for a fresh cvs checkout of REL8_1_STABLE instead. It took a while,
since Slack's version of Bison is too old and I needed to get a more recent
one.

Anyway, the restore still fails, backtrace against cvs now looks like this:


#0  0x0825382c in CopySnapshot (snapshot=0x0) at tqual.c:1301
1301            newsnap = (Snapshot) palloc(sizeof(SnapshotData) +
(gdb) where
#0  0x0825382c in CopySnapshot (snapshot=0x0) at tqual.c:1301
#1  0x081467db in fmgr_sql (fcinfo=0xbff21cb0) at functions.c:319
#2  0x0814017c in ExecMakeFunctionResult (fcache=0x84ce4a8,
econtext=0x84ce900, isNull=0xbff21f2b "$\033=\b", isDone=0x0) at
execQual.c:1095
#3  0x08142796 in ExecEvalExprSwitchContext (expression=0x84d04f4,
econtext=0x0, isNull=0xbff21f2b "$\033=\b", isDone=0x0) at execQual.c:2864
#4  0x08189f23 in evaluate_expr (expr=0x84ce4a8, result_type=23) at
clauses.c:2646
#5  0x0818b9b9 in simplify_function (funcid=291943, result_type=23,
args=0x84c60c0, allow_inline=1 '\001', context=0xbff22140) at clauses.c:2260
#6  0x0818beba in eval_const_expressions_mutator (node=0x84c5e44,
context=0xbff22140) at clauses.c:1305
#7  0x0818a68d in expression_tree_mutator (node=0x84c4688, mutator=0x818bce0
<eval_const_expressions_mutator>, context=0xbff22140) at clauses.c:3473
#8  0x0818bf1d in eval_const_expressions_mutator (node=0x84c477c,
context=0xbff22140) at clauses.c:1335
#9  0x0818c501 in eval_const_expressions_mutator (node=0x84c6004,
context=0xbff22140) at clauses.c:2030
#10 0x0818cb05 in eval_const_expressions (node=0x84c47a8) at clauses.c:1211
#11 0x0822ff93 in RelationGetIndexPredicate (relation=0x442fb9f4) at
relcache.c:2790
#12 0x080cb4cf in BuildIndexInfo (index=0x442fb9f4) at index.c:900
#13 0x080fec96 in analyze_rel (relid=293056, vacstmt=0x4422628c) at
analyze.c:257
#14 0x0813734b in vacuum (vacstmt=0x4422628c, relids=0x4421d89c) at
vacuum.c:476
#15 0x0819365f in autovacuum_do_vac_analyze (relids=0x44224154, dovacuum=0
'\0', doanalyze=1 '\001', freeze=0 '\0') at autovacuum.c:907
#16 0x0819401d in AutoVacMain (argc=0, argv=0x0) at autovacuum.c:681
#17 0x08194366 in autovac_start () at autovacuum.c:170
#18 0x0819a101 in ServerLoop () at postmaster.c:1269
#19 0x0819b292 in PostmasterMain (argc=3, argv=0x833b8a0) at postmaster.c:943
#20 0x0815bf3e in main (argc=3, argv=0x833b8a0) at main.c:256

Again and still, relation 293056 referred to in frame 13 is the
purchaseorder_line table.



--
Best,




Frank.

Re: segfault of autovacuum process during restore - coredumps included

From
Tom Lane
Date:
Frank van Vugt <ftm.van.vugt@foxi.nl> writes:
> Hmm, the patch mentioned for ExecMain.c won't apply to my base version, so I
> went for a fresh cvs checkout of REL8_1_STABLE instead. It took a while,
> since Slack's version of Bison is too old and I needed to get a more recent
> one.
> Anyway, the restore still fails, backtrace against cvs now looks like this:

Can you break out a test case from the restore script and send it to
Alvaro and me?  If it's still failing on CVS tip then there's something
else going on here ...

            regards, tom lane

Re: segfault of autovacuum process during restore - coredumps included

From
Frank van Vugt
Date:
> Can you break out a test case from the restore script and send it to
> Alvaro and me?  If it's still failing on CVS tip then there's something
> else going on here ...

Mmm, I tried and created a dump of the purchaseorder_line table only, but its
restore didn't fail in 50+ attempts ;(. Obviously the restore doesn't take
very long, so it may be a matter of just missing the window of opportunity,
or it might be related to something that wasn't in that dump. Apart from
that, providing the complete dump would be difficult due to nda-restrictions
anyway.

Would you be interested in the coredump itself (5MB bzipped) or can I do
anything else to assist?





--
Best,




Frank.