Re: Something flaky in the "relfilenode mapping" infrastructure - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Something flaky in the "relfilenode mapping" infrastructure
Date
Msg-id 20140613021240.GA719601@tornado.leadboat.com
In response to Re: Something flaky in the "relfilenode mapping" infrastructure  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Something flaky in the "relfilenode mapping" infrastructure  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Jun 12, 2014 at 02:44:10AM -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-06-12 00:38:36 -0400, Noah Misch wrote:
> >> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2014-06-12%2000%3A17%3A07
>
> > Hm. My guess is that it's just a 'harmless' concurrency issue. The
> > test currently runs concurrently with others: I think what happens is
> > that the table gets dropped in the other relation after the query has
> > acquired the mvcc snapshot (used for the pg_class test).
> > But why is it triggering on such an 'unusual' system and not on others?
> > That's what worries me a bit.

I can reproduce a similar disturbance in the test query using gdb and a
concurrent table drop, and the table reported in the prairiedog failure is a
table dropped in a concurrent test group.  That explanation may not be the
full story behind these particular failures, but it certainly could cause
similar failures in the future.

Let's prevent this by only reporting rows for relations that still exist after
the query is complete.
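
To illustrate the idea only (this is not the attached patch, and every name
below is hypothetical), here is a small Python model of the race. A row taken
from an older snapshot fails the reverse filenode lookup because its table was
dropped concurrently; rechecking existence afterwards suppresses that row.

```python
# Hypothetical model of the race; none of these names come from PostgreSQL
# or from the attached patch.

def find_mismatches(snapshot_rows, lookup):
    """Return rows whose filenode does not map back to the row's own oid."""
    return [
        (oid, relname, lookup(filenode))
        for oid, relname, filenode in snapshot_rows
        if lookup(filenode) != oid
    ]

def drop_vanished(mismatches, live_oids):
    """Keep only mismatches for relations that still exist after the query."""
    return [row for row in mismatches if row[0] in live_oids]

# Rows as seen by a snapshot taken before a concurrent drop of oid 200.
snapshot_rows = [(100, "kept_table", 5000), (200, "dropped_table", 6000)]
# Live filenode -> oid state seen by the lookup after the drop committed.
live_map = {5000: 100}

mismatches = find_mismatches(snapshot_rows, live_map.get)
reported = drop_vanished(mismatches, live_oids=set(live_map.values()))
```

Without the recheck, the dropped table shows up as a spurious mismatch; with
it, only relations that still exist can be reported.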

> prairiedog is pretty damn slow by modern standards.  OTOH, I think it
> is not the slowest machine in the buildfarm; hamster for instance seems
> to be at least a factor of 2 slower.  So I'm not sure whether to believe
> it's just a timing issue.

That kernel's process scheduler could be a factor.

--
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com
