Strong feeling of something ugly lurking deeply within 7.0 ;-) - Mailing list pgsql-bugs

From Christof Petig
Subject Strong feeling of something ugly lurking deeply within 7.0 ;-)
Date
Msg-id 39D8FDCC.D43E7F77@wtal.de
List pgsql-bugs
The severity of this bug depends heavily on how bug-free your own programs are.

Short description:
Long-standing open transactions, combined with high-traffic updates and
regular vacuums, eventually corrupt memory.

Long description:
Due to a design flaw in our ecpg programs (I don't recommend
designing for autocommit off!), some transactions stayed open for several
days. A process data collection system generates a lot of status change
updates (3MB a day) to about 110 rows in a table at the same time.
After every 1024 updates I vacuum the high-traffic table, which should then
shrink to 16kB. The first thing I noticed was that vacuum did not free old
tuples. This put me on the track of the real cause.
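
Why an open transaction would keep vacuum from freeing old tuples can be
illustrated with a very simplified model. This is only a sketch and not
PostgreSQL's actual code: the names TupleVersion, vacuum and run_updates are
hypothetical, and the rule assumed here is that a superseded row version may
be removed only if it was superseded before the oldest still-open
transaction started.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TupleVersion:
    xmin: int             # id of the transaction that created this version
    xmax: Optional[int]   # id of the transaction that superseded it, None if live


def vacuum(versions: List[TupleVersion],
           oldest_open_xid: Optional[int]) -> List[TupleVersion]:
    """Return the versions that survive a vacuum in this simplified model.

    Assumption: a dead version is removable only if no transaction is open,
    or it was superseded before the oldest open transaction began.
    """
    kept = []
    for v in versions:
        dead = v.xmax is not None
        removable = dead and (oldest_open_xid is None or v.xmax < oldest_open_xid)
        if not removable:
            kept.append(v)
    return kept


def run_updates(n_updates: int, first_xid: int = 100) -> List[TupleVersion]:
    """Update one row n_updates times; each update supersedes the previous version."""
    versions = [TupleVersion(xmin=first_xid, xmax=None)]
    xid = first_xid
    for _ in range(n_updates):
        xid += 1
        versions[-1].xmax = xid
        versions.append(TupleVersion(xmin=xid, xmax=None))
    return versions


versions = run_updates(1024)
no_long_txn = vacuum(versions, oldest_open_xid=None)
with_long_txn = vacuum(versions, oldest_open_xid=50)  # opened before any update
print(len(versions), len(no_long_txn), len(with_long_txn))
```

In this model the vacuum after 1024 updates shrinks the row to a single
version when nothing is open, but keeps every version while a transaction
that started before the updates is still open.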

For the past three weeks (with more buggy long-standing transactions) I have
seen one major crash of the program system per week. For months I have seen
some strange NOTICEs which went away after another vacuum. And this
morning I found a 'possible memory corruption, killing other backends'
message.

The situation got better and better during the 7.0 development cycle (I
started with a pre-beta version this January and reported some
concurrent vacuum oddities at that time). And it got worse the more
interactive programs we added.
But until now I had not identified the particular ingredient that causes
the pain: long-standing transactions.

It's not very bad. This seems to happen only under rare conditions. Until
this week I thought of it as a minor oddity - a temporary nuisance.

And: this is the current stable CVS tree, running on a 233MHz Pentium II,
Linux 2.2.14(?)

Sample Code:
    update bn_actual set meter=meter+1 where machine= ?; -- repeat every second
combined with
    begin transaction; -- hold open
    select something;
and
    vacuum analyze; -- once a day
and
    vacuum bn_actual; -- every 1024 updates

and some others.
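
To get a feeling for the cadence of the sample above, here is a small sketch
that only generates the statement stream and does not talk to a database.
The function name workload is hypothetical, and it assumes a single update
stream issuing exactly one update per second. The held-open transaction and
the daily vacuum analyze are left out.

```python
from typing import Iterator


def workload(total_updates: int, vacuum_every: int = 1024) -> Iterator[str]:
    """Yield one update per step and a table vacuum after every vacuum_every updates."""
    for i in range(1, total_updates + 1):
        yield "update bn_actual set meter=meter+1 where machine= ?"
        if i % vacuum_every == 0:
            yield "vacuum bn_actual"


SECONDS_PER_DAY = 24 * 60 * 60          # one update per second (assumption)
statements = list(workload(SECONDS_PER_DAY))
table_vacuums = statements.count("vacuum bn_actual")
minutes_between_vacuums = round(1024 / 60, 1)
print(table_vacuums, minutes_between_vacuums)
```

Under these assumptions the table is vacuumed 84 times a day, or about once
every 17 minutes, so a transaction that stays open for days overlaps a large
number of vacuums.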

PS: Of course I'm currently fixing the long transactions problem. I'll
tell you once the system has run for 4 weeks again without any strange
occurrence.
PPS: Yes, I'm following the hackers list.
P3S: No, I don't believe in a hardware bug.
