Thread: The Data Base System is in recovery mode

The Data Base System is in recovery mode

From
Palle Girgensohn
Date:
Hi!

I have a problem with postgresl-7.0.2 on Freebsd 4-stable.

Somehow, shared memory gets corrupted by a certain type of query, and
all databases end up unusable until I kill -9 the bad postgres process
and stop+restart postgres.

I mistakenly ran postgres without -N and -B flags, and I guess the
problem will be solved by raising these parameters, right?

/Palle

query: SELECT alignment, derived_from, group_id, meta_info, meta_name, meta_type, owner, perm, timestamp, type FROM
componentWHERE id=4344 
ProcessQuery
CommitTransactionCommand
type 25 :restypmod -1 :resname fnamn :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 3
:varattno3 :vartype 25 :vartypmod -1  :varl 
evelsup 0 :varnoold 3 :varoattno 3}} { TARGETENTRY :resdom { RESDOM :resno 4 :restype 25 :restypmod -1 :resname enamn
:reskey0 :reskeyop 0 :ressortgroupref 0 : 
resjunk false } :expr { VAR :varno 3 :varattno 4 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 3 :varoattno 4}} {
TARGETENTRY:resdom { RESDOM :resno 5 :r 
estype 25 :restypmod -1 :resname titel :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 3
:varattno5 :vartype 25 :vartypmod -1  :va 
rlevelsup 0 :varnoold 3 :varoattno 5}} { TARGETENTRY :resdom { RESDOM :resno 6 :restype 25 :restypmod -1 :resname
telefon:reskey 0 :reskeyop 0 :ressortgroupref 
 0 :resjunk false } :expr { VAR :varno 3 :varattno 10 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 3 :varoattno
10}}{ TARGETENTRY :resdom { RESDOM :resn 
o 7 :restype 25 :restypmod -1 :resname mobil :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR
:varno3 :varattno 12 :vartype 25 :vartypmod  
-1  :varlevelsup 0 :varnoold 3 :varoattno 12}} { TARGETENTRY :resdom { RESDOM :resno 8 :restype 25 :restypmod -1
:resnameemail :reskey 0 :reskeyop 0 :ressortgr 
oupref 0 :resjunk false } :expr { VAR :varno 3 :varattno 13 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 3
:varoattno13}} { TARGETENTRY :resdom { RESDOM 
 :resno 9 :restype 1184 :restypmod -1 :resname tidpunkt :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr
{VAR :varno 3 :varattno 16 :vartype 118 
4 :vartypmod -1  :varlevelsup 0 :varnoold 3 :varoattno 16}}) :qual { EXPR :typeOid 16  :opType and :oper <> :args ({
CONST:consttype 16 :constlen 1 :constisnul 
l false :constvalue  1 [ 1 0 0 0 ]  :constbyval true } { EXPR :typeOid 16  :opType op :oper { OPER :opno 1209 :opid 0
:opresulttype16 } :args ({ EXPR :typeOid  
25  :opType func :oper { FUNC :funcid 870 :functype 25 :funcisindex false :funcsize 0  :func_fcache @ 0x0 :func_tlist
({TARGETENTRY :resdom { RESDOM :resno 1 : 
restype 25 :restypmod -1 :resname \\<noname> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR
:varno-1 :varattno 1 :vartype 25 :vartypmod  
-1  :varlevelsup 0 :varnoold -1 :varoattno 1}}) :func_planlist <>} :args ({ VAR :varno 3 :varattno 4 :vartype 25
:vartypmod-1  :varlevelsup 0 :varnoold 3 :varo 
attno 4})} { EXPR :typeOid 25  :opType func :oper { FUNC :funcid 870 :functype 25 :funcisindex false :funcsize 0
:func_fcache@ 0x0 :func_tlist ({ TARGETENTRY  
:resdom { RESDOM :resno 1 :restype 25 :restypmod -1 :resname \\<noname> :reskey 0 :reskeyop 0 :ressortgroupref 0
:resjunkfalse } :expr { VAR :varno -1 :varattn 
o 1 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold -1 :varoattno 1}}) :func_planlist <>} :
ProcessQuery
CommitTransactionCommand
StartTransactionCommand
query: begin transaction
ProcessUtility: begin transaction
CommitTransactionCommand
StartTransactionCommand
query: declare curse cursor for select personid,foretag,fnamn,enamn,titel,telefon,mobil,email,tidpunkt, lower(foretag)
fromwtabmaria order by lower(foretag) 
ProcessQuery
CommitTransactionCommand
StartTransactionCommand
query: fetch forward 20 from curse
ProcessUtility: fetch forward 20 from curse
Server process (pid 15342) exited with status 139 at Tue Oct 17 14:37:20 2000
Terminating any active server processes...
NOTICE:  Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.
NOTICE:  Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.
...
The Data Base System is in recovery mode
The Data Base System is in recovery mode
The Data Base System is in recovery mode
[repeatedly, until kill -9]

--
         Partitur Informationsteknik AB
Wenner-Gren Center             +46 8 566 280 02
113 46 Stockholm           +46 70 785 86 02
Sweden                   girgen@partitur.se

Re: The Data Base System is in recovery mode

From
Tom Lane
Date:
Palle Girgensohn <girgen@partitur.se> writes:
> I have a problem with postgresl-7.0.2 on Freebsd 4-stable.

> query: declare curse cursor for select personid,foretag,fnamn,enamn,titel,telefon,mobil,email,tidpunkt,
lower(foretag)from wtabmaria order by lower(foretag) 
> query: fetch forward 20 from curse
> Server process (pid 15342) exited with status 139 at Tue Oct 17 14:37:20 2000

Hm.  I couldn't duplicate this crash using
7.0.2-plus-some-7.0.3-patches.  However I don't recall any bug fixes for
cursors in the 7.0.* branch.  Could you provide a more complete bug
report, like the complete schema for the table?  Also, can you provide
a gdb traceback from the corefile that the crashing backend hopefully
left behind in the database subdirectory ($PGDATA/data/base/yourdb)?

            regards, tom lane

Re: The Data Base System is in recovery mode

From
Palle Girgensohn
Date:
Hi!

If needed, I'll try to provoke the error again, first recompiling with
-g. I will probably not find time for it before the week-end, though.

New facts: The problem "moved" when raising the -B from default (64)
to 1000 (-N 100 now). Now, here's what happens:

query: CREATE VIEW wtabmaria AS SELECT
p.personid,p.foretag,p.fnamn,p.enamn,p.titel,p.telefon,p.mobil,p.email,p.tidpunktFROM personer p WHERE true AND low 
er(p.enamn)~~lower('Branner%')
ProcessUtility: CREATE VIEW wtabmaria AS SELECT
p.personid,p.foretag,p.fnamn,p.enamn,p.titel,p.telefon,p.mobil,p.email,p.tidpunktFROM personer p WHERE tru 
e AND lower(p.enamn)~~lower('Branner%')
query: INSERT INTO pg_rewrite (rulename, ev_type, ev_class, ev_attr, ev_action, ev_qual, is_instead) VALUES
('_RETwtabmaria',1::char, 6275026::oid, -1::int 
2, '({ QUERY :command 1  :utility <> :resultRelation 0 :into <> :isPortal false :isBinary false :isTemp false :unionall
false:distinctClause <> :sortClause 
 <> :rtable ({ RTE :relname wtabmaria :ref { ATTR :relname *CURRENT* :attrs <>} :relid 6275026 :inh false :inFromCl
false:inJoinSet false :skipAcl false} { 
 RTE :relname wtabmaria :ref { ATTR :relname *NEW* :attrs <>} :relid 6275026 :inh false :inFromCl false :inJoinSet
false:skipAcl false} { RTE :relname pers 
oner :ref { ATTR :relname p :attrs <>} :relid 820284 :inh false :inFromCl true :inJoinSet true :skipAcl false})
:targetlist({ TARGETENTRY :resdom { RESDOM  
:resno 1 :restype 23 :restypmod -1 :resname personid :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr {
VAR:varno 3 :varattno 1 :vartype 23  
:vartypmod -1  :varlevelsup 0 :varnoold 3 :varoattno 1}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 25 :restypmod
-1:resname foretag :reskey 0 :reske 
yop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 3 :varattno 2 :vartype 25 :vartypmod -1  :varlevelsup 0
:varnoold3 :varoattno 2}} { TARGETENTR 
Y :resdom { RESDOM :resno 3 :restype 25 :restypmod -1 :resname fnamn :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk
false} :expr { VAR :varno 3 :varattn 
o 3 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 3 :varoattno 3}} { TARGETENTRY :resdom { RESDOM :resno 4
:restype25 :restypmod -1 :resname enamn :r 
eskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 3 :varattno 4 :vartype 25 :vartypmod -1
:varlevelsup0 :varnoold 3 :varoattno 4} 
} { TARGETENTRY :resdom { RESDOM :resno 5 :restype 25 :restypmod -1 :resname titel :reskey 0 :reskeyop 0
:ressortgroupref0 :resjunk false } :expr { VAR :va 
rno 3 :varattno 5 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 3 :varoattno 5}} { TARGETENTRY :resdom { RESDOM
:resno6 :restype 25 :restypmod -1 :re 
sname telefon :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 3 :varattno 10 :vartype 25
:vartypmod-1  :varlevelsup 0 :varnool 
d 3 :varoattno 10}} { TARGETENTRY :resdom { RESDOM :resno 7 :restype 25 :restypmod -1 :resname mobil :reskey 0
:reskeyop0 :ressortgroupref 0 :resjunk false 
 } :expr { VAR :varno 3 :varattno 12 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 3 :varoattno 12}} {
TARGETENTRY:resdom { RESDOM :resno 8 :restype  
25 :restypmod -1 :resname email :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 3
:varattno13 :vartype 25 :vartypmod -1  :varl 
evelsup 0 :varnoold 3 :varoattno 13}} { TARGETENTRY :resdom { RESDOM :resno 9 :restype 1184 :restypmod -1 :resname
tidpunkt:reskey 0 :reskeyop 0 :ressortgr 
oupref 0 :resjunk false } :expr { VAR :varno 3 :varattno 16 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 3
:varoattno16}}) :qual { EXPR :typeOid 1 
6  :opType and :oper <> :args ({ CONST :consttype 16 :constlen 1 :constisnull false :constvalue  1 [ 1 0 0 0 ]
:constbyvaltrue } { EXPR :typeOid 16  :opTy 
pe op :oper { OPER :opno 1209 :opid 0 :opresulttype 16 } :args ({ EXPR :typeOid 25  :opType func :oper { FUNC :funcid
870:functype 25 :funcisindex false :f 
uncsize 0  :func_fcache @ 0x0 :func_tlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 25 :restypmod -1 :resname
\\<noname>:reskey 0 :reskeyop 0 :ress 
ortgroupref 0 :resjunk false } :expr { VAR :varno -1 :varattno 1 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold -1
:varoattno1}}) :func_planlist <>} : 
args ({ VAR :varno 3 :varattno 4 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 3 :varoattno 4})} { EXPR :typeOid
25 :opType func :oper { FUNC :funcid 
 870 :functype 25 :funcisindex false :funcsize 0  :func_fcache @ 0x0 :func_tlist ({ TARGETENTRY :resdom { RESDOM :resno
1:restype 25 :restypmod -1 :resname 
 \\<noname> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno -1 :varattno 1 :vartype 25
:vartypmod-1  :varlevelsup 0 :varnoold  
-1 :varoattno 1}}) :func_planlist <>} :
ProcessQuery
CommitTransactionCommand
StartTransactionCommand
query: begin transaction
ProcessUtility: begin transaction
CommitTransactionCommand
StartTransactionCommand
query: declare curse cursor for select personid,foretag,fnamn,enamn,titel,telefon,mobil,email,tidpunkt, lower(foretag)
fromwtabmaria order by lower(foretag 
)
ProcessQuery
CommitTransactionCommand
StartTransactionCommand
query: fetch forward 20 from curse
ProcessUtility: fetch forward 20 from curse
CommitTransactionCommand
StartTransactionCommand
query: select distinct personid,foretag, lower(foretag) from wtabmaria order by lower(foretag)
ProcessQuery
Server process (pid 3096) exited with status 139 at Wed Oct 18 09:19:00 2000
Terminating any active server processes...
NOTICE:  Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.
NOTICE:  Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.
NOTICE:  Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.
Server processes were terminated at Wed Oct 18 09:19:01 2000
Reinitializing shared memory and semaphores
DEBUG:  Data Base System is starting up at Wed Oct 18 09:19:01 2000
DEBUG:  Data Base System was interrupted being in production at Wed Oct 18 01:12:29 2000
DEBUG:  Data Base System is in production state at Wed Oct 18 09:19:01 2000

It looks very much like an out-of-memory error, since now, the fetch
is ok, but it fails on the next query, which is a memory hog (~5500
records). I don't know why the progammer initially put the second
select inside the transaction, and I have no put an "abort" after the
"fetch", to see if I might get away with it now. It seems to work
(using -B 2000 too)...

The higher I set '-B', the harder it gets to reproduce the error. I
*think* I might be able to find a value for '-B' where I can just
lower it by one and get it to fail, (given no other users in the
database system).

Schema:

CREATE SEQUENCE "pidseq" start 9700 increment 1 maxvalue 2147483647 minvalue 1  cache 1 ;
CREATE TABLE "personer" (
        "personid" int4 DEFAULT nextval('pidseq'::text) NOT NULL,
        "foretag" text,
        "fnamn" text,
        "enamn" text,
        "titel" text,
        "adress1" text,
        "adress2" text,
        "postadress" text,
        "land" character(2) DEFAULT 'SE' NOT NULL,
        "telefon" text,
        "telefax" text,
        "mobil" text,
        "email" text,
        "privemail" text,
        "url" text,
        "tidpunkt" timestamp DEFAULT now() NOT NULL,
        "avdelning" text
);

There are about 5500 records in this table.

/Palle

Tom Lane <tgl@sss.pgh.pa.us> writes:

> Palle Girgensohn <girgen@partitur.se> writes:
> > I have a problem with postgresl-7.0.2 on Freebsd 4-stable.
>
> > query: declare curse cursor for select personid,foretag,fnamn,enamn,titel,telefon,mobil,email,tidpunkt,
lower(foretag)from wtabmaria order by lower(foretag) 
> > query: fetch forward 20 from curse
> > Server process (pid 15342) exited with status 139 at Tue Oct 17 14:37:20 2000
>
> Hm.  I couldn't duplicate this crash using
> 7.0.2-plus-some-7.0.3-patches.  However I don't recall any bug fixes for
> cursors in the 7.0.* branch.  Could you provide a more complete bug
> report, like the complete schema for the table?  Also, can you provide
> a gdb traceback from the corefile that the crashing backend hopefully
> left behind in the database subdirectory ($PGDATA/data/base/yourdb)?
>
>             regards, tom lane

--
         Partitur Informationsteknik AB
Wenner-Gren Center             +46 8 566 280 02
113 46 Stockholm           +46 70 785 86 02
Sweden                   girgen@partitur.se

Re: The Data Base System is in recovery mode

From
Tom Lane
Date:
Palle Girgensohn <girgen@partitur.se> writes:
> New facts: The problem "moved" when raising the -B from default (64)
> to 1000 (-N 100 now). Now, here's what happens:

Interesting.

> query: CREATE VIEW wtabmaria AS SELECT
p.personid,p.foretag,p.fnamn,p.enamn,p.titel,p.telefon,p.mobil,p.email,p.tidpunktFROM personer p WHERE true AND low 
> er(p.enamn)~~lower('Branner%')

Hm, you had not mentioned before that wtabmaria is a view.  That may be
the critical factor.  However, I'm still not having any luck duplicating
the failure.

> It looks very much like an out-of-memory error,

I don't think I believe that; out-of-memory problems should be reported
as such.  Moreover, I see nothing in this query that would require
touching more than one disk buffer at a time.  Now I really want to see
the coredump backtrace...

> There are about 5500 records in this table.

Is that total in "personer", or total shown by the view?

            regards, tom lane

Re: The Data Base System is in recovery mode

From
Palle Girgensohn
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Palle Girgensohn <girgen@partitur.se> writes:
> > New facts: The problem "moved" when raising the -B from default (64)
> > to 1000 (-N 100 now). Now, here's what happens:
>
> Interesting.

I am also using -o -S 2048 now, but I doubt it helps...

> > query: CREATE VIEW wtabmaria AS SELECT
p.personid,p.foretag,p.fnamn,p.enamn,p.titel,p.telefon,p.mobil,p.email,p.tidpunktFROM personer p WHERE true AND low 
> > er(p.enamn)~~lower('Branner%')
>
> Hm, you had not mentioned before that wtabmaria is a view.  That may be
> the critical factor.  However, I'm still not having any luck duplicating
> the failure.

Oh, gee. sorry! I slipped with the mouse when cut'n'pasting. The first
statement should be a create view. The view is dropped and created
"on-the-fly" with different where-clauses every time, depending on the
user's search arguments. As I might have said before, I am not the
programmer... I think I would have the view created once, and done
"select from view where..." Maybe that would help, but there is still
a bug in postgres, I guess?

> > It looks very much like an out-of-memory error,

I must take that back, since the select from view only holds two
records... My mistake, again...

> I don't think I believe that; out-of-memory problems should be reported
> as such.  Moreover, I see nothing in this query that would require
> touching more than one disk buffer at a time.  Now I really want to see
> the coredump backtrace...
>
> > There are about 5500 records in this table.
>
> Is that total in "personer", or total shown by the view?

in "personer".

Also, I have suddenly become a "very lousy bug reporter", sorry. I forgot the indices:

CREATE  INDEX "personer_fnamn_idx" on "personer" using btree ( lower ("fnamn") "text_ops" );
CREATE UNIQUE INDEX "personer_personid_index" on "personer" using btree ( "personid" "int4_ops" );
CREATE  INDEX "personer_personid_fname_idx" on "personer" using btree ( "personid" "int4_ops", "fnamn" "text_ops" );

Cheers,
Palle

Re: The Data Base System is in recovery mode

From
Tom Lane
Date:
Palle Girgensohn <girgen@partitur.se> writes:
> Oh, gee. sorry! I slipped with the mouse when cut'n'pasting. The first
> statement should be a create view. The view is dropped and created
> "on-the-fly" with different where-clauses every time, depending on the
> user's search arguments. As I might have said before, I am not the
> programmer... I think I would have the view created once, and done
> "select from view where..." Maybe that would help, but there is still
> a bug in postgres, I guess?

I agree, that is a bizarre way to do things, not least because it
wouldn't work for multiple concurrent clients (unless the view name is
client-specific?).  But I don't see why it would provoke a crash.
Curiouser and curiouser.

The indexes are good to know about, but AFAICT they won't be used for
these particular queries.  So I'm still unable to duplicate the problem.
Looking forward to that backtrace...

            regards, tom lane

Re: The Data Base System is in recovery mode

From
Palle Girgensohn
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Palle Girgensohn <girgen@partitur.se> writes:
> Oh, gee. sorry! I slipped with the mouse when cut'n'pasting. The
> first > statement should be a create view. The view is dropped and
> created > "on-the-fly" with different where-clauses every time,
> depending on the > user's search arguments. As I might have said
> before, I am not the > programmer... I think I would have the view
> created once, and done > "select from view where..." Maybe that
> would help, but there is still > a bug in postgres, I guess?
>
> I agree, that is a bizarre way to do things, not least because it
> wouldn't work for multiple concurrent clients (unless the view name
> is client-specific?).

They *are* client specific. wtabmaria is "Maria's working table", sort
of...

> But I don't see why it would provoke a crash.
> Curiouser and curiouser.
>
> The indexes are good to know about, but AFAICT they won't be used
> for these particular queries.  So I'm still unable to duplicate the
> problem.  Looking forward to that backtrace...

I'll get on with it this weekend.

Cheers,
Palle

Re: The Data Base System is in recovery mode

From
Palle Girgensohn
Date:
Hi!

Here's a traceback from the time it failed, unfortunately without
debugging symbols. I cannot reproduce this error on a fresh
installation on another machine, which puzzles me. I though it might
have to do with the the database beeing dumped/restored to this new
installation, but I found no way to get the data there without
dump/restore. tarring the data/base/dbname dir doesn't work, right?

It is on FreeBSD 4.1.1, an SMP machine. Built with multibyte encoding,
default encoding set to LATIN1, (using the freebsd port).

Current directory is /usr/local/pgsql/bin/
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
(no debugging symbols found)...
Core was generated by `postgres'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libcrypt.so.2...(no debugging symbols found)...
done.
Reading symbols from /usr/lib/libm.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libutil.so.3...(no debugging symbols found)...
done.
Reading symbols from /usr/lib/libreadline.so.4...
(no debugging symbols found)...done.
Reading symbols from /usr/lib/libncurses.so.5...(no debugging symbols found)...
done.
Reading symbols from /usr/lib/libc.so.4...(no debugging symbols found)...done.
Reading symbols from /usr/libexec/ld-elf.so.1...(no debugging symbols found)...
done.
#0  0x8136eb2 in GetTemplateEncoding ()
(gdb) bt
#0  0x8136eb2 in GetTemplateEncoding ()
#1  0x8136d66 in pg_mb2wchar_with_len ()
#2  0x810adab in int8_text ()
#3  0x810ae82 in textlike ()
#4  0x812ef14 in fmgr_c ()
#5  0x80a1839 in GetAttributeByName ()
#6  0x80a187f in GetAttributeByName ()
#7  0x80a1b63 in ExecEvalExpr ()
#8  0x80a1c36 in ExecQual ()
#9  0x80a20b7 in ExecScan ()
#10 0x80a701a in ExecSeqScan ()
#11 0x80a084d in ExecProcNode ()
#12 0x80a73f8 in ExecSort ()
#13 0x80a0891 in ExecProcNode ()
#14 0x80a766f in ExecUnique ()
#15 0x80a089d in ExecProcNode ()
#16 0x809fa19 in ExecutorEnd ()
#17 0x809ef6a in ExecutorRun ()
#18 0x80f67aa in ProcessPortal ()
#19 0x80f6827 in ProcessQuery ()
#20 0x80f522c in pg_exec_query_dest ()
#21 0x80f50f5 in pg_plan_query ()
#22 0x80f6202 in PostgresMain ()
#23 0x80deaf8 in PostmasterMain ()
#24 0x80de634 in PostmasterMain ()
#25 0x80dd815 in PostmasterMain ()
#26 0x80dd212 in PostmasterMain ()
#27 0x80afea7 in main ()
#28 0x80639b9 in _start ()
(gdb)


This core dump is not when the database ended up in recovery mode. As
you might remember, I fiddled with the memory settings, and the higher
they went, the more stable the system became.

Here's the postgres log (-d 2) from this occasion:

ProcessQuery
CommitTransactionCommand
StartTransactionCommand
query: select relname from pg_class where relname='wtabniclas'
ProcessQuery
CommitTransactionCommand
StartTransactionCommand
query: drop view wtabniclas
ProcessUtility: drop view wtabniclas
CommitTransactionCommand
StartTransactionCommand
query: CREATE VIEW wtabniclas AS SELECT
p.personid,p.foretag,p.fnamn,p.enamn,p.titel,p.telefon,p.mobil,p.email,p.tidpunktFROM personer p W\ 
HERE true AND lower(p.foretag)~~lower('Taxi%%')
ProcessUtility: CREATE VIEW wtabniclas AS SELECT
p.personid,p.foretag,p.fnamn,p.enamn,p.titel,p.telefon,p.mobil,p.email,p.tidpunktFROM per\ 
soner p WHERE true AND lower(p.foretag)~~lower('Taxi%%')
query: INSERT INTO pg_rewrite (rulename, ev_type, ev_class, ev_attr, ev_action, ev_qual, is_instead) VALUES
('_RETwtabniclas',1::char, 6282\ 
784::oid, -1::int2, '({ QUERY :command 1  :utility <> :resultRelation 0 :into <> :isPortal false :isBinary false
:isTempfalse :unionall fal\ 
se :distinctClause <> :sortClause <> :rtable ({ RTE :relname wtabniclas :ref { ATTR :relname *CURRENT* :attrs <>}
:relid6282784 :inh false \ 
:inFromCl false :inJoinSet false :skipAcl false} { RTE :relname wtabniclas :ref { ATTR :relname *NEW* :attrs <>} :relid
6282784:inh false :\ 
inFromCl false :inJoinSet false :skipAcl false} { RTE :relname personer :ref { ATTR :relname p :attrs <>} :relid 820284
:inhfalse :inFromCl\ 
 true :inJoinSet true :skipAcl false}) :targetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 23 :restypmod -1
:resnamepersonid :res\ 
key 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 3 :varattno 1 :vartype 23 :vartypmod -1
:varlevelsup0 :varnoold 3\ 
 :varoattno 1}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 25 :restypmod -1 :resname foretag :reskey 0 :reskeyop
0:ressortgroupref 0 \ 
:resjunk false } :expr { VAR :varno 3 :varattno 2 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 3 :varoattno 2}}
{TARGETENTRY :resdom\ 
 { RESDOM :resno 3 :restype 25 :restypmod -1 :resname fnamn :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false }
:expr{ VAR :varno 3 :\ 
varattno 3 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 3 :varoattno 3}} { TARGETENTRY :resdom { RESDOM :resno 4
:restype25 :restypm\ 
od -1 :resname enamn :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 3 :varattno 4
:vartype25 :vartypmod -1  :\ 
varlevelsup 0 :varnoold 3 :varoattno 4}} { TARGETENTRY :resdom { RESDOM :resno 5 :restype 25 :restypmod -1 :resname
titel:reskey 0 :reskeyo\ 
p 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 3 :varattno 5 :vartype 25 :vartypmod -1  :varlevelsup 0
:varnoold3 :varoattno 5}\ 
} { TARGETENTRY :resdom { RESDOM :resno 6 :restype 25 :restypmod -1 :resname telefon :reskey 0 :reskeyop 0
:ressortgroupref0 :resjunk false\ 
 } :expr { VAR :varno 3 :varattno 10 :vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 3 :varoattno 10}} {
TARGETENTRY:resdom { RESDOM :r\ 
esno 7 :restype 25 :restypmod -1 :resname mobil :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR
:varno3 :varattno 12 \ 
:vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 3 :varoattno 12}} { TARGETENTRY :resdom { RESDOM :resno 8 :restype
25:restypmod -1 :res\ 
name email :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 3 :varattno 13 :vartype 25
:vartypmod-1  :varlevels\ 
up 0 :varnoold 3 :varoattno 13}} { TARGETENTRY :resdom { RESDOM :resno 9 :restype 1184 :restypmod -1 :resname tidpunkt
:reskey0 :reskeyop 0\ 
 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 3 :varattno 16 :vartype 1184 :vartypmod -1  :varlevelsup 0
:varnoold3 :varoattno 16\ 
}}) :qual { EXPR :typeOid 16  :opType and :oper <> :args ({ CONST :consttype 16 :constlen 1 :constisnull false
:constvalue 1 [ 1 0 0 0 ]  :\ 
constbyval true } { EXPR :typeOid 16  :opType op :oper { OPER :opno 1209 :opid 0 :opresulttype 16 } :args ({ EXPR
:typeOid25  :opType func \ 
:oper { FUNC :funcid 870 :functype 25 :funcisindex false :funcsize 0  :func_fcache @ 0x0 :func_tlist ({ TARGETENTRY
:resdom{ RESDOM :resno \ 
1 :restype 25 :restypmod -1 :resname \\<noname> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR
:varno-1 :varattno 1 \ 
:vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold -1 :varoattno 1}}) :func_planlist <>} :args ({ VAR :varno 3
:varattno2 :vartype 25 :var\ 
typmod -1  :varlevelsup 0 :varnoold 3 :varoattno 2})} { EXPR :typeOid 25  :opType func :oper { FUNC :funcid 870
:functype25 :funcisindex fa\ 
lse :funcsize 0  :func_fcache @ 0x0 :func_tlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 25 :restypmod -1
:resname\\<noname> :resk\ 
ey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno -1 :varattno 1 :vartype 25 :vartypmod -1
:varlevelsup0 :varnoold -\ 
1 :varoattno 1}}) :func_planlist <>
ProcessQuery
CommitTransactionCommand
StartTransactionCommand
query: begin transaction
ProcessUtility: begin transaction
CommitTransactionCommand
StartTransactionCommand
query: declare curse cursor for select personid,foretag,fnamn,enamn,titel,telefon,mobil,email,tidpunkt, lower(foretag)
fromwtabniclas order\ 
 by lower(foretag)
ProcessQuery
CommitTransactionCommand
StartTransactionCommand
query: fetch forward 20 from curse
ProcessUtility: fetch forward 20 from curse
CommitTransactionCommand
StartTransactionCommand
query: abort transaction
ProcessUtility: abort transaction
CommitTransactionCommand
StartTransactionCommand
query: select distinct personid,foretag, lower(foretag) from wtabniclas order by lower(foretag)
ProcessQuery
Server process (pid 6757) exited with status 139 at Wed Oct 18 14:51:53 2000
Terminating any active server processes...
Server processes were terminated at Wed Oct 18 14:51:53 2000
Reinitializing shared memory and semaphores
DEBUG:  Data Base System is starting up at Wed Oct 18 14:51:53 2000
DEBUG:  Data Base System was interrupted being in production at Wed Oct 18 14:51:21 2000
DEBUG:  Data Base System is in production state at Wed Oct 18 14:51:53 2000


Hope this helps...

/Palle


Tom Lane <tgl@sss.pgh.pa.us> writes:

> Palle Girgensohn <girgen@partitur.se> writes:
> > I have a problem with postgresl-7.0.2 on Freebsd 4-stable.
>
> > query: declare curse cursor for select personid,foretag,fnamn,enamn,titel,telefon,mobil,email,tidpunkt,
lower(foretag)from wtabmaria order by lower(foretag) 
> > query: fetch forward 20 from curse
> > Server process (pid 15342) exited with status 139 at Tue Oct 17 14:37:20 2000
>
> Hm.  I couldn't duplicate this crash using
> 7.0.2-plus-some-7.0.3-patches.  However I don't recall any bug fixes for
> cursors in the 7.0.* branch.  Could you provide a more complete bug
> report, like the complete schema for the table?  Also, can you provide
> a gdb traceback from the corefile that the crashing backend hopefully
> left behind in the database subdirectory ($PGDATA/data/base/yourdb)?
>
>             regards, tom lane

Re: The Data Base System is in recovery mode

From
Tom Lane
Date:
Palle Girgensohn <girgen@partitur.se> writes:
> I cannot reproduce this error on a fresh
> installation on another machine, which puzzles me. I though it might
> have to do with the the database beeing dumped/restored to this new
> installation, but I found no way to get the data there without
> dump/restore. tarring the data/base/dbname dir doesn't work, right?

No, but tarring the entire data directory should work (as long
as the destination machine is same architecture and same configure
options as source).  Partial tar doesn't work because you've got to have
pg_log matching the data files...

            regards, tom lane

Re: The Data Base System is in recovery mode

From
Tom Lane
Date:
Palle Girgensohn <girgen@partitur.se> writes:
> GNU gdb 4.18
> Program terminated with signal 11, Segmentation fault.
> (gdb) bt
> #0  0x8136eb2 in GetTemplateEncoding ()
> #1  0x8136d66 in pg_mb2wchar_with_len ()
> #2  0x810adab in int8_text ()
> #3  0x810ae82 in textlike ()
> #4  0x812ef14 in fmgr_c ()

I do not believe this backtrace.  textlike does not call int8_text;
int8_text does not call pg_mb2wchar_with_len; pg_mb2wchar_with_len
doesn't call GetTemplateEncoding; and GetTemplateEncoding is just
about as SEGV-proof as any routine I've ever seen:

static int    templateEncoding;
...
int
GetTemplateEncoding()
{
    return (templateEncoding);
}

I speculate that you gave gdb the wrong executable file to compare
to the core file ...

            regards, tom lane

Re: The Data Base System is in recovery mode

From
Palle Girgensohn
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> and GetTemplateEncoding is just about as SEGV-proof as any routine
> I've ever seen:
>
> static int    templateEncoding;
> ...
> int
> GetTemplateEncoding()
> {
>     return (templateEncoding);
> }

Heh

> I speculate that you gave gdb the wrong executable file to compare
> to the core file ...

I too had a hard time believeing it. I used xemacs (M-x gdb-with-core)
and used the core dump and /usr/local/pgsql/bin/postgres. I was
surprised that there was funcion names, since there shouldn't be any
(no debugging symbols). Hence I think too, this backtrace is
rubbish. :(

I'll see if I can get one tomorrow, but I will have to rebuild
postgres with -g and recreate the error. Problem is, I couldn't do it
on my machine. Odd...

I'll get back to you when I have more info.

Cheers,
Palle

Re: The Data Base System is in recovery mode

From
Tom Lane
Date:
Palle Girgensohn <girgen@partitur.se> writes:
> I was surprised that there was funcion names, since there shouldn't be
> any (no debugging symbols).

Not necessarily.  If you haven't applied strip(1) to the executable,
there will be function names in the backtrace on most platforms, even
without -g.  What -g adds is parameter info and file/line number info.

> Hence I think too, this backtrace is rubbish. :(

It is that :-(.  Better luck tomorrow.

            regards, tom lane

Re: The Data Base System is in recovery mode

From
Bruce Momjian
Date:
> Palle Girgensohn <girgen@partitur.se> writes:
> > GNU gdb 4.18
> > Program terminated with signal 11, Segmentation fault.
> > (gdb) bt
> > #0  0x8136eb2 in GetTemplateEncoding ()
> > #1  0x8136d66 in pg_mb2wchar_with_len ()
> > #2  0x810adab in int8_text ()
> > #3  0x810ae82 in textlike ()
> > #4  0x812ef14 in fmgr_c ()
>
> I do not believe this backtrace.  textlike does not call int8_text;

Only Tom Lane could question a backtrace and get away with it.  :-)

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: The Data Base System is in recovery mode

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> I do not believe this backtrace.  textlike does not call int8_text;

> Only Tom Lane could question a backtrace and get away with it.  :-)

If you haven't ever seen a bogus backtrace, you've not been using
debuggers long enough ;-).  gdb has no good way to know if you've
given it an executable that matches the corefile or not.  I *always*
check to see if I believe a backtrace before I get too bogged down
in possibly-phony details.

            regards, tom lane

Re: The Data Base System is in recovery mode

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> I do not believe this backtrace.  textlike does not call int8_text;
>
> > Only Tom Lane could question a backtrace and get away with it.  :-)
>
> If you haven't ever seen a bogus backtrace, you've not been using
> debuggers long enough ;-).  gdb has no good way to know if you've
> given it an executable that matches the corefile or not.  I *always*
> check to see if I believe a backtrace before I get too bogged down
> in possibly-phony details.

Yea, I know.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: The Data Base System is in recovery mode

From
The Hermit Hacker
Date:
On Sun, 22 Oct 2000, Bruce Momjian wrote:

> > Palle Girgensohn <girgen@partitur.se> writes:
> > > GNU gdb 4.18
> > > Program terminated with signal 11, Segmentation fault.
> > > (gdb) bt
> > > #0  0x8136eb2 in GetTemplateEncoding ()
> > > #1  0x8136d66 in pg_mb2wchar_with_len ()
> > > #2  0x810adab in int8_text ()
> > > #3  0x810ae82 in textlike ()
> > > #4  0x812ef14 in fmgr_c ()
> >
> > I do not believe this backtrace.  textlike does not call int8_text;
>
> Only Tom Lane could question a backtrace and get away with it.  :-)

Okay, *how* do you come to that conclusion?  Although Tom most likely has
that section of code programmed into his brain from all the work he's done
on the fmgr, all he did was trace the code by hand ... I'd give him
demi-god status for knowing the code such that he most likely knew it was
wrong as soon as he saw it (again, he's just spent alot of time in that
section of code), but beyond that, its a simple manual trace of the code
...