Re: [HACKERS] SIGSEGV in BRIN autosummarize - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: [HACKERS] SIGSEGV in BRIN autosummarize
Msg-id 20171018164855.GC17895@telsasoft.com
In response to Re: [HACKERS] SIGSEGV in BRIN autosummarize  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: [HACKERS] SIGSEGV in BRIN autosummarize
List pgsql-hackers
On Tue, Oct 17, 2017 at 09:07:40AM -0500, Justin Pryzby wrote:
> On Tue, Oct 17, 2017 at 09:34:24AM -0400, Tom Lane wrote:
> > Justin Pryzby <pryzby@telsasoft.com> writes:
> > > On Tue, Oct 17, 2017 at 12:59:16PM +0200, Alvaro Herrera wrote:
> > >> Anyway, can give this patch a try?
> > 
> > The trick in this sort of situation is to make sure you build binaries
> > that match your existing install in every way except having the added
> > patch, and maybe getting installed into a different directory.
>
> I'm familiar with that process; but, these are PG10 binaries from PGDG for
> centos6 x64.  I had to add symlinks for postgis library, but otherwise seems to
> be working fine (although I didn't preserve as many configure options as your
> message would suggest I should have).

On Tue, Oct 17, 2017 at 12:49:55PM -0400, Tom Lane wrote:
> So what I'm thinking is that you need an error during perform_work_item,
> and/or more than one work_item picked up in the calling loop, to make this
> bug manifest.  You would need to enter perform_work_item in a

..in our case probably due to interruption by LOCK TABLE, yes?

On Tue, Oct 17, 2017 at 12:49:55PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > And I think that's because we're not
> > checking that the namespace OID is a valid value before calling
> > get_namespace_name on it.
> 
> The part of your patch that adds a check on avw_database is clearly
> correct and necessary.  I'm thinking the change you propose in
> perform_work_item is just overcomplicating code that's okay as it
> stands.  We don't need to optimize for the schema-went-away case.

No crashes in ~28 hours.  It occurs to me that this is a weaker test, since I
didn't preserve most of the compilation options.  If I understand correctly,
our crash isn't explained by the avw_database test anyway (?)

Should I "make clean" and recompile with all the original configure options
(other than the prefix) and a minimal patch
(avw_database==MyDatabaseId || continue)?

Or recompile with the existing options but no patch, to first verify that the
crash occurs with a locally compiled binary?
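
To be clear about what I mean by the minimal patch, here is a rough standalone
sketch of the idea, not the actual autovacuum.c code.  The WorkItem struct is a
simplified stand-in, claim_work_items() is a hypothetical helper, and every
field name other than avw_database is an assumption made for illustration.
The only point is that the loop skips any work item whose avw_database does
not match the worker's own database before doing anything else with it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

/* Simplified stand-in for an autovacuum work item (hypothetical layout). */
typedef struct WorkItem
{
    bool avw_used;      /* slot holds a pending request */
    bool avw_active;    /* some worker is already processing it */
    Oid  avw_database;  /* database the request belongs to */
    Oid  avw_relation;  /* relation the request targets */
} WorkItem;

/*
 * Hypothetical helper: mark as active every pending item that belongs to
 * my_database_id and return how many were claimed.  Items queued for other
 * databases are left untouched.
 */
int
claim_work_items(WorkItem *items, size_t nitems, Oid my_database_id)
{
    int claimed = 0;

    assert(items != NULL || nitems == 0);

    for (size_t i = 0; i < nitems; i++)
    {
        WorkItem *item = &items[i];

        /* Skip empty slots and items another worker already took. */
        if (!item->avw_used || item->avw_active)
            continue;

        /* The minimal patch: ignore requests for other databases. */
        if (item->avw_database != my_database_id)
            continue;

        item->avw_active = true;
        claimed++;
    }

    return claimed;
}
```

In the real code the comparison would be against MyDatabaseId, and the claimed
item would then be handed to perform_work_item rather than just counted.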

Justin


