Re: Errors on missing pg_subtrans/ files with 9.3 - Mailing list pgsql-hackers

From J Smith
Subject Re: Errors on missing pg_subtrans/ files with 9.3
Msg-id CADFUPgeDA2fzX-NeiOFW39uSbvOC9aApv8y0BZJ8FY=1EJTHGQ@mail.gmail.com
In response to Re: Errors on missing pg_subtrans/ files with 9.3  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Errors on missing pg_subtrans/ files with 9.3
List pgsql-hackers
Alright, we'll look into doing that heading into the weekend.
Interestingly, we haven't experienced the issue since our main Java
developer made some modifications to our backend system. I'm not
entirely sure what the changes entail except that it's a one-liner
that involves re-SELECTing a table during a transaction. We'll
roll back this change, recompile Postgres with google-coredumper,
let it run over the weekend, and see where we stand.

Cheers

On Tue, Nov 19, 2013 at 9:14 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Nov 15, 2013 at 4:01 PM, J Smith <dark.panda+lists@gmail.com> wrote:
>> On Fri, Nov 15, 2013 at 3:21 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> I think what would help the most is if you could arrange to obtain a
>>> stack backtrace at the point when the error is thrown.  Maybe put a
>>> long sleep call in just before the error happens, and when it gets
>>> stuck there, attach gdb and run bt full.
>>>
>>
>> That could potentially be doable. Perhaps I could use something like
>> google-coredumper or something similar to have a core dump generated
>> if the error comes up? Part of the problem is that the error is so
>> sporadic that it's going to be tough to say when the next one will
>> occur. For instance, we haven't changed our load on the server, yet
>> the error hasn't occurred since Nov 13, 15:01. I'd also like to avoid
>> blocking on the server with sleep or anything like that unless
>> absolutely necessary, as there are other services we have in
>> development that are using other databases on this cluster. (I can as
>> a matter of last resort, of course, but if google-coredumper can do
>> the job I'd like to give that a shot first.)
>>
>> Any hints on where I could insert something like this? Should I try
>> putting it into the section of elog.c dealing with ENOENT errors, or
>> try to find a spot closer to where the file itself is being opened? I
>> haven't looked at Postgres internals for a while now so I'm not quite
>> sure of the best location for this sort of thing.
>
> I'd look for the specific ereport() call that's firing, and put it
> just before that.
>
> (note that setting the error verbosity to 'verbose' will give you the
> file and line number where the error is happening, which is useful if
> the message can be generated from more than one place)
>
> I'm not familiar with google-coredumper but it sounds like a promising
> technique.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
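
For illustration only, the "sleep just before the error, then attach gdb" idea quoted above could look roughly like the standalone sketch below. It is not PostgreSQL source: `pause_for_debugger` and `open_or_pause` are hypothetical names, and the plain `open()` call merely stands in for whatever code path raises the missing-file error. In the server, the pause would sit immediately before the specific ereport() call that fires.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PostgreSQL): print our PID and sleep,
 * leaving time to run `gdb -p <pid>` followed by `bt full`.
 */
void
pause_for_debugger(unsigned int seconds)
{
    fprintf(stderr, "pid %ld: sleeping %u s, attach gdb now\n",
            (long) getpid(), seconds);
    sleep(seconds);
}

/*
 * Stand-in for the code path that reports the missing file.
 * Returns the file descriptor on success. If the file does not exist,
 * pauses for pause_seconds and then returns -1 with errno == ENOENT.
 */
int
open_or_pause(const char *path, unsigned int pause_seconds)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0 && errno == ENOENT)
    {
        int saved_errno = errno;    /* fprintf/sleep may clobber errno */

        pause_for_debugger(pause_seconds);
        errno = saved_errno;
        /* In the server, the ereport() call would follow here. */
    }
    return fd;
}
```

Calling `open_or_pause(path, 600)` would hold the process for ten minutes on a missing file. Because only the affected backend sleeps, this blocks less than a server-wide pause would, although any locks that backend holds stay held for the duration.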


