Re: [BUGS] Breakage with VACUUM ANALYSE + partitions - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [BUGS] Breakage with VACUUM ANALYSE + partitions
Date
Msg-id 20160429235837.b63wkjw4dduf56wt@alap3.anarazel.de
Whole thread Raw
In response to Re: [BUGS] Breakage with VACUUM ANALYSE + partitions  (Thom Brown <thom@linux.com>)
Responses Re: [BUGS] Breakage with VACUUM ANALYSE + partitions  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2016-04-28 17:41:29 +0100, Thom Brown wrote:
> I've noticed another breakage, which I can reproduce consistently.

> 2016-04-28 17:36:08 BST [18108]: [47-1] user=,db=,client= DEBUG:  could not
> fsync file "base/24581/24594.1" but retrying: No such file or directory
> 2016-04-28 17:36:08 BST [18108]: [48-1] user=,db=,client= ERROR:  could not
> fsync file "base/24581/24594.1": No such file or directory
> 2016-04-28 17:36:08 BST [18605]: [17-1]
> user=thom,db=postgres,client=[local] ERROR:  checkpoint request failed
> 2016-04-28 17:36:08 BST [18605]: [18-1]
> user=thom,db=postgres,client=[local] HINT:  Consult recent messages in the
> server log for details.

Yuck. md.c is so crummy :(


Basically the reason for the problem is that mdsync() needs to access
"formally non-existant segments" (as in ones where previous segments are
< RELSEG_SIZE), because we queue (and the might be preexistant) fsync
requests via register_dirty_segment() in mdtruncate().

I'm a bit of a loss of how to reconcile that view with the original
issue in this thread.  The best I can come up with this moment is doing
a _mdfd_openseg() in mdsync() to open the truncated segment if
_mdfd_getseg() returned NULL. We don't want to normally use that in
either function because it'll imply a separate open() etc, which is
pretty expensive - but doing in the fallback case would be kind of ok.

Andres



pgsql-hackers by date:

Previous
From: Andreas Karlsson
Date:
Subject: Re: Accidentally parallel unsafe functions
Next
From: Andres Freund
Date:
Subject: Re: atomic pin/unpin causing errors