Re: seg fault crashed the postmaster - Mailing list pgsql-general

From Gordon Shannon
Subject Re: seg fault crashed the postmaster
Date
Msg-id AANLkTik-YyxHu8d_VwBu8bjuWB6NqOKOvGXYrNaZdQoS@mail.gmail.com
In response to Re: seg fault crashed the postmaster  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: seg fault crashed the postmaster  (Gordon Shannon <gordo169@gmail.com>)
List pgsql-general
Interesting. That's exactly what we have been doing -- trying to update the same rows in multiple txns. For us to proceed in production, I will take steps to ensure we stop doing that, as it's just an app bug really.

The table in question -- v_messages -- is an empty base table with 76 partitions that together hold 2.8 billion rows.
Let me summarize what I see as the key facts here:

(All problems have come from the UPDATE query; the failing statements are identical except for different "author_id" values.)

1. We did a "link" upgrade Wed night, from 8.4.4 to 9.0.2, so the upgrade happened in place; no data files were copied.
2. The 1st error was "compressed data is corrupt" at 18:16.
3. We got 2 seg fault crashes before turning on cores and getting a 3rd crash with the stack trace.
4. We then got an "invalid memory alloc request size 18446744073449177092" at 23:50. This was an ERROR, not a crash.
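
A quick arithmetic check on the request size in item 4 may be useful here. This is only a sketch: the variable names are illustrative, and treating the size as a 64-bit unsigned value is an assumption, not something confirmed in this thread. The number sits just below 2^64, which is what a negative 64-bit integer looks like when it is read as unsigned.

```python
# Request size from the "invalid memory alloc request size" error in item 4.
reported = 18446744073449177092

# Assumption: the size is a 64-bit unsigned value. Reinterpret it as signed
# (two's complement) to see whether it could have started out negative.
signed = reported - 2**64 if reported >= 2**63 else reported

print(signed)  # -260374524
```

A negative length of this kind would be consistent with a garbled length field being read somewhere, but on its own it does not distinguish corrupt data from a code bug.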

At this point, is it your suspicion that there is a code bug in 9.0.2, rather than corrupt data?

I will post the schema and then work on a test case.

-gordon

On Fri, Dec 31, 2010 at 8:34 AM, Tom Lane wrote:

Hmm.  This suggests that there's something wrong in the EvalPlanQual
code, which gets invoked when there are concurrent updates to the same
row (ie, the row this UPDATE is trying to change is one that was changed
by some other transaction since the query started).  That stuff got
rewritten rather thoroughly for 9.0, so the idea of a new bug there
isn't exactly surprising.  But it's going to be hard to find without
a test case.  Can you show us the full schema for this table and all
the queries that execute against it up till the point of the failure?
(Turning on log_statement across all sessions would help collect that
info, if you don't have it already.)


