Thread: Postgres bug (working with iserverd)

Postgres bug (working with iserverd)

From
"A.V.Shutko"
Date:
Hello, all,

Your server has a bug that sometimes causes core dumps.

System:    FreeBSD 4.2-20001127-STABLE
Compiler:  gcc 2.95.2
Platform:  x86 (PII-600)
RAM:       256 MB

Only one program works with the server, IServerd; it runs 10 parallel
processes, each of which holds a db connection.

Here is the information that I got from the core files:

/------------------------------------------------
#0  0x80c11b3 in exec_append_initialize_next ()
#1  0x80c1287 in ExecInitAppend ()
#2  0x80be3ee in ExecInitNode ()
#3  0x80be239 in EvalPlanQual ()
#4  0x80bdbb3 in ExecReplace ()
#5  0x80bd871 in ExecutePlan ()
#6  0x80bccea in ExecutorRun ()
#7  0x81036fb in ProcessQuery ()
#8  0x810217d in pg_exec_query_string ()
#9  0x81031ac in PostgresMain ()
#10 0x80eda66 in DoBackend ()
#11 0x80ed622 in BackendStartup ()
#12 0x80ec815 in ServerLoop ()
#13 0x80ec1fb in PostmasterMain ()
#14 0x80cd0a8 in main ()
#15 0x8064765 in _start ()
/------------------------------------------------

Dump of assembler code for function exec_append_initialize_next:
0x80c1174 <exec_append_initialize_next>:        push   %ebp
0x80c1175 <exec_append_initialize_next+1>:      mov    %esp,%ebp
0x80c1177 <exec_append_initialize_next+3>:      push   %esi
0x80c1178 <exec_append_initialize_next+4>:      push   %ebx
0x80c1179 <exec_append_initialize_next+5>:      mov    0x8(%ebp),%esi
0x80c117c <exec_append_initialize_next+8>:      mov    0x20(%esi),%ebx
0x80c117f <exec_append_initialize_next+11>:     mov    0x54(%esi),%edx
0x80c1182 <exec_append_initialize_next+14>:     mov    0x18(%edx),%eax
0x80c1185 <exec_append_initialize_next+17>:     mov    0x1c(%edx),%ecx
0x80c1188 <exec_append_initialize_next+20>:     test   %eax,%eax
0x80c118a <exec_append_initialize_next+22>:     jge    0x80c1198 <exec_append_initialize_next+36>
0x80c118c <exec_append_initialize_next+24>:     movl   $0x0,0x18(%edx)
0x80c1193 <exec_append_initialize_next+31>:     xor    %eax,%eax
0x80c1195 <exec_append_initialize_next+33>:     jmp    0x80c11be <exec_append_initialize_next+74>
0x80c1197 <exec_append_initialize_next+35>:     nop
0x80c1198 <exec_append_initialize_next+36>:     cmp    %ecx,%eax
0x80c119a <exec_append_initialize_next+38>:     jl     0x80c11a4 <exec_append_initialize_next+48>
0x80c119c <exec_append_initialize_next+40>:     dec    %ecx
0x80c119d <exec_append_initialize_next+41>:     mov    %ecx,0x18(%edx)
0x80c11a0 <exec_append_initialize_next+44>:     xor    %eax,%eax
0x80c11a2 <exec_append_initialize_next+46>:     jmp    0x80c11be <exec_append_initialize_next+74>
0x80c11a4 <exec_append_initialize_next+48>:     cmpb   $0x0,0x50(%esi)
0x80c11a8 <exec_append_initialize_next+52>:     je     0x80c11b9 <exec_append_initialize_next+69>
0x80c11aa <exec_append_initialize_next+54>:     shl    $0x5,%eax
0x80c11ad <exec_append_initialize_next+57>:     add    0x10(%ebx),%eax
0x80c11b0 <exec_append_initialize_next+60>:     mov    %eax,0x18(%ebx)
0x80c11b3 <exec_append_initialize_next+63>:     mov    0x1c(%eax),%eax
  ^^^^^^ - <<<<<<<<<< here >>>>>>>>>>

0x80c11b6 <exec_append_initialize_next+66>:     mov    %eax,0x1c(%ebx)
0x80c11b9 <exec_append_initialize_next+69>:     mov    $0x1,%eax
0x80c11be <exec_append_initialize_next+74>:     pop    %ebx
0x80c11bf <exec_append_initialize_next+75>:     pop    %esi
0x80c11c0 <exec_append_initialize_next+76>:     leave
0x80c11c1 <exec_append_initialize_next+77>:     ret

With respect,
A.V.Shutko                          mailto:AVShutko@mail.khstu.ru

Re: Postgres bug (working with iserverd)

From
Tom Lane
Date:
"A.V.Shutko" <AVShutko@mail.khstu.ru> writes:
> Your server has a bug that sometimes causes core dumps.

What version of postgres?  What is the query being processed?

            regards, tom lane

Re: Postgres bug (working with iserverd)

From
Tom Lane
Date:
"A.V.Shutko" <AVShutko@mail.khstu.ru> writes:
> Your server has a bug that sometimes causes core dumps.

> Tom Lane> What version of postgres?
> # ./postgres -V
> postgres (PostgreSQL) 7.1

Okay, I think I understand the scenario here.  Are you using table
inheritance?  I can produce a crash in the same place using UPDATE
of an inheritance group:


regression=# create table par(f1 int);
CREATE
regression=# create table child(f2 int) inherits (par);
CREATE
regression=# insert into par values(1);
INSERT 1453231 1
regression=# begin;
BEGIN
regression=# update par set f1 = f1 + 1;
UPDATE 1

<< now start a second backend, and in it also do >>

regression=# update par set f1 = f1 + 1;

<< second backend blocks waiting for first one to commit;
   go back to first backend and do >>

regression=# end;
COMMIT

<< now second backend crashes in exec_append_initialize_next >>


The direct cause of the problem is that EvalPlanQual isn't completely
initializing the estate that it sets up for re-evaluating the plan.
In particular it's not filling in es_result_relations and
es_num_result_relations, which need to be set up if the top plan node
is an Append.  (That's probably my fault.)  But there are a bunch of
other fields that it's failing to copy, too.
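
To make the failure mode concrete, here is a minimal sketch in C. It is not
the backend source: the struct layouts are hypothetical, cut down to the
fields named above, and select_result_relation() is an invented helper that
stands in for the step where the Append node picks its current result
relation. The sketch returns false in the situation where, judging by the
trace, the real code followed a pointer computed from an unset
es_result_relations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for the executor structs. */
typedef struct ResultRelInfo {
    int   ri_index;        /* illustrative payload only */
    void *ri_junkFilter;
} ResultRelInfo;

typedef struct EState {
    ResultRelInfo *es_result_relations;      /* array, one entry per target rel */
    int            es_num_result_relations;
    ResultRelInfo *es_result_relation_info;  /* entry currently in use */
    void          *es_junkFilter;
} EState;

/*
 * Invented helper: select the result relation for sub-plan `whichplan`.
 * A zeroed child estate has es_result_relations == NULL, so computing
 * es_result_relations + whichplan and reading through it is invalid.
 * This model reports the problem instead of dereferencing.
 */
static bool
select_result_relation(EState *estate, int whichplan)
{
    assert(estate != NULL);

    if (estate->es_result_relations == NULL ||
        whichplan < 0 ||
        whichplan >= estate->es_num_result_relations)
        return false;

    estate->es_result_relation_info = estate->es_result_relations + whichplan;
    estate->es_junkFilter = estate->es_result_relation_info->ri_junkFilter;
    return true;
}
```

With a child estate that was only memset to zero, the helper refuses to
proceed; once the array and its count are filled in, the same call succeeds.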

Vadim, I'm thinking that EvalPlanQual would be better if it memcpy'd
the parent estate, and then changed the fields that should be different,
rather than zeroing the child state and then copying the fields that
need to be copied.  Seems like the default behavior should be to copy
fields rather than leave them zero.  What do you think?  Which fields
should really be zero in the child?
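
The two initialization styles can be contrasted in a small C sketch. The
MiniEState struct and both functions are assumptions made for illustration;
the field list is a cut-down model, and child_scratch is a hypothetical field
standing in for "something that should start out empty in the child".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down model of an executor state struct. */
typedef struct MiniEState {
    int   es_direction;
    void *es_snapshot;
    void *es_range_table;
    void *es_result_relations;
    int   es_num_result_relations;
    void *child_scratch;   /* hypothetical: must start empty in the child */
} MiniEState;

/* Zero-then-copy: any field not listed explicitly silently stays zero. */
static void
init_child_by_listing(MiniEState *child, const MiniEState *parent)
{
    memset(child, 0, sizeof(*child));
    child->es_direction = parent->es_direction;
    child->es_snapshot = parent->es_snapshot;
    child->es_range_table = parent->es_range_table;
    /* es_result_relations and its count are not listed: they stay NULL/0 */
}

/* Copy-then-override: inherit everything, then reset what must differ. */
static void
init_child_by_copy(MiniEState *child, const MiniEState *parent)
{
    assert(child != parent);   /* memcpy requires non-overlapping objects */
    memcpy(child, parent, sizeof(*child));
    child->child_scratch = NULL;
}
```

The design trade-off is which mistake is cheaper: with zero-then-copy a
forgotten field becomes a NULL pointer, while with copy-then-override a
forgotten reset shares parent state with the child.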

            regards, tom lane

Re: Postgres bug (working with iserverd)

From
Tom Lane
Date:
I wrote:
> The direct cause of the problem is that EvalPlanQual isn't completely
> initializing the estate that it sets up for re-evaluating the plan.
> In particular it's not filling in es_result_relations and
> es_num_result_relations, which need to be set up if the top plan node
> is an Append.  (That's probably my fault.)  But there are a bunch of
> other fields that it's failing to copy, too.

I believe I have fixed this problem in CVS sources for current and
REL7_1, at least to the extent that EvalPlanQual processing produces
the right answers for updates/deletes in inheritance trees.

However, EvalPlanQual still leaks more memory than suits me ---
auxiliary memory allocated by the plan nodes is not recovered.
I think the correct way to implement it would be to create a new
memory context for each level of EvalPlanQual execution and use
that context as the "per-query context" for the sub-query.  The
whole context (including the copied plan) would be freed at the
end of the sub-query.  The notion of a stack of currently-unused
epqstate nodes would go away.
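
As a rough illustration of the per-level context idea, here is a toy
allocator in C. It is not the backend's memory-context code; MiniContext and
the context_* functions are invented names. The point it shows is that every
allocation made for one sub-query is tracked by one context, so destroying
the context releases all of it at once without each plan node having to free
its own pieces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every allocation; the union keeps the
 * payload that follows suitably aligned for any object type. */
typedef union Chunk {
    union Chunk *next;
    max_align_t  align;
} Chunk;

/* Invented toy context: a list of everything allocated through it. */
typedef struct MiniContext {
    Chunk  *chunks;
    size_t  nchunks;
} MiniContext;

static MiniContext *
context_create(void)
{
    return calloc(1, sizeof(MiniContext));
}

/* Allocate `size` bytes that belong to `cxt`; returns NULL on failure. */
static void *
context_alloc(MiniContext *cxt, size_t size)
{
    Chunk *c;

    assert(cxt != NULL);
    c = malloc(sizeof(Chunk) + size);
    if (c == NULL)
        return NULL;
    c->next = cxt->chunks;
    cxt->chunks = c;
    cxt->nchunks++;
    return c + 1;          /* payload starts right after the header */
}

/* Free every allocation and the context itself; returns the chunk count. */
static size_t
context_destroy(MiniContext *cxt)
{
    size_t freed = 0;
    Chunk *c = cxt->chunks;

    while (c != NULL) {
        Chunk *next = c->next;
        free(c);
        c = next;
        freed++;
    }
    free(cxt);
    return freed;
}
```

In this model, one context would be created when a level of re-evaluation
starts, the copied plan and all node state would be allocated from it, and a
single context_destroy() call at the end would reclaim everything.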

This would mean a few more cycles per tuple to copy the plan tree over
again each time, but I think that's pretty trivial compared to the plan
startup/shutdown costs that we incur anyway.  Besides, I have hopes of
making plan trees read-only whenever we do the fabled querytree
redesign, so the cost will someday go away.

Comments, objections?
        regards, tom lane


Re: Postgres bug (working with iserverd)

From
Tom Lane
Date:
"Vadim Mikheev" <vmikheev@sectorbase.com> writes:
>> However, EvalPlanQual still leaks more memory than suits me ---
>> auxiliary memory allocated by the plan nodes is not recovered.

> Isn't plan shutdown supposed to free memory?

Yeah, but it leaks all over the place; none of the plan node types
bother to free their state nodes, for example.  There are lots of other
cases.  You really have to reset the per-query context to get rid of all
the cruft allocated during ExecInitNode.

> How do subselects run queries again and again?

They don't end and restart them; they just rescan them.  If we had
this substitute-a-new-tuple hack integrated into the Param mechanism,
then EvalPlanQual could use ExecReScan too, but at the moment no...
        regards, tom lane
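
The difference between rescanning and ending/restarting can be shown with a
toy scan node in C. MiniScan and the scan_* functions are invented for this
sketch and are not the executor's API; init_calls only counts how often the
(notionally expensive) setup runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Invented toy scan node over an in-memory array of rows. */
typedef struct MiniScan {
    const int *rows;
    int        nrows;
    int        pos;
} MiniScan;

static int init_calls = 0;   /* counts setup work, for illustration */

/* Setup: allocates node state; this is the cost a restart pays again. */
static MiniScan *
scan_init(const int *rows, int nrows)
{
    MiniScan *s = malloc(sizeof(*s));

    if (s == NULL)
        return NULL;
    s->rows = rows;
    s->nrows = nrows;
    s->pos = 0;
    init_calls++;
    return s;
}

/* Fetch the next row into *out; returns false when the scan is exhausted. */
static bool
scan_next(MiniScan *s, int *out)
{
    assert(s != NULL && out != NULL);
    if (s->pos >= s->nrows)
        return false;
    *out = s->rows[s->pos++];
    return true;
}

/* Rescan: rewind the existing state, no allocation and no new setup. */
static void
scan_rescan(MiniScan *s)
{
    s->pos = 0;
}

/* Shutdown: release the node state. */
static void
scan_end(MiniScan *s)
{
    free(s);
}
```

Running the scan to completion, calling scan_rescan(), and reading again
yields the first row once more while init_calls stays at 1; an end/restart
cycle would instead go through scan_end() and scan_init() each time.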


Re: Postgres bug (working with iserverd)

From
"Vadim Mikheev"
Date:
> > How do subselects run queries again and again?
> 
> They don't end and restart them; they just rescan them.  If we had

Thanks for the reminder.

> this substitute-a-new-tuple hack integrated into the Param mechanism,
> then EvalPlanQual could use ExecReScan too, but at the moment no...

I see.

Vadim




Re: Postgres bug (working with iserverd)

From
"Vadim Mikheev"
Date:
> However, EvalPlanQual still leaks more memory than suits me ---
> auxiliary memory allocated by the plan nodes is not recovered.
> I think the correct way to implement it would be to create a new
> memory context for each level of EvalPlanQual execution and use
> that context as the "per-query context" for the sub-query.  The
> whole context (including the copied plan) would be freed at the
> end of the sub-query.  The notion of a stack of currently-unused
> epqstate nodes would go away.
> 
> This would mean a few more cycles per tuple to copy the plan tree over
> again each time, but I think that's pretty trivial compared to the plan
> startup/shutdown costs that we incur anyway.  Besides, I have hopes of
> making plan trees read-only whenever we do the fabled querytree
> redesign, so the cost will someday go away.

Isn't plan shutdown supposed to free memory? How do subselects run queries
again and again? I haven't been in the planner/executor areas for a long
time and have no time to look there now -:(, so I'm just asking -:)

Vadim