Re: could not truncate directory "pg_subtrans": apparent wraparound - Mailing list pgsql-hackers
From | Dan Langille |
---|---|
Subject | Re: could not truncate directory "pg_subtrans": apparent wraparound |
Date | |
Msg-id | CAPG9OKf=8cXfjsu-2g=mk46yvwRXT11dkDn16GsSjVKODLqCnw@mail.gmail.com |
In response to | Re: could not truncate directory "pg_subtrans": apparent wraparound (Thomas Munro <thomas.munro@enterprisedb.com>) |
List | pgsql-hackers |
<div dir="ltr">If there's anything I can try on my servers to help diagnose the issues, please let me know. If desired, I can arrange access for debugging.<br /></div>
<div class="gmail_extra"><br />
<div class="gmail_quote">On Sat, Jun 6, 2015 at 12:51 AM, Thomas Munro <span dir="ltr">&lt;<a href="mailto:thomas.munro@enterprisedb.com" target="_blank">thomas.munro@enterprisedb.com</a>&gt;</span> wrote:<br />
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Sat, Jun 6, 2015 at 1:25 PM, Alvaro Herrera &lt;<a href="mailto:alvherre@2ndquadrant.com">alvherre@2ndquadrant.com</a>&gt; wrote:<br />
> Thomas Munro wrote:<br />
><br />
>> My idea was that if I could get oldestXact == next XID in<br />
>> TruncateSUBTRANS, then TransactionIdToPage(oldestXact) for a value of<br />
>> oldestXact that happens to be immediately after a page boundary (so<br />
>> that xid % 2048 == 0) might give a page number that is >=<br />
>> latest_page_number, causing SimpleLruTruncate to print that message.<br />
>> But I can't figure out how to get next XID == oldest XID, because<br />
>> vacuumdb --freeze --all consumes xids itself, so in my first attempt<br />
>> at this, next XID is always 3 ahead of the oldest XID when a<br />
>> checkpoint is run.<br />
><br />
> vacuumdb starts by querying pg_database, which eats one XID.<br />
><br />
> Vacuum itself only uses one XID when vac_truncate_clog() is called.<br />
> This is called from vac_update_datfrozenxid(), which always happens at<br />
> the end of each user-invoked VACUUM (so three times for vacuumdb if you<br />
> have three databases); autovacuum does it also at the end of each run.<br />
> Maybe you can get autovacuum to quit before doing it.<br />
><br />
> OTOH, if the values in the pg_database entry do not change,<br />
> vac_truncate_clog is not called, and thus vacuum would finish without<br />
> consuming an XID.<br />
<br />
</span>I have managed to reproduce it a few times but haven't quite found the<br />
right synchronisation hacks to make it reliable, so I'm not posting a<br />
repro script yet.<br />
<br />
I think it's a scary-sounding message but very rare and entirely<br />
harmless (unless you really have wrapped around...). The fix is<br />
probably something like: if oldest XID == next XID, then just don't<br />
call SimpleLruTruncate (truncation is deferred until the next<br />
checkpoint), or perhaps (if we can confirm this doesn't cause problems<br />
for dirty pages or that there can't be any dirty pages before cutoff<br />
page because of the preceding flush (as I suspect)) we could use<br />
cutoffPage = TransactionIdToPage(oldestXact - 1) if oldest == next, or<br />
maybe even always.<br />
<div class="HOEnZb"><div class="h5"><br />
--<br />
Thomas Munro<br />
<a href="http://www.enterprisedb.com" target="_blank">http://www.enterprisedb.com</a><br />
</div></div></blockquote></div><br /></div>
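The page arithmetic behind the quoted hypothesis can be sketched in a few lines of C. This is a simplified model, not PostgreSQL code: `xid_to_page`, `latest_page`, and the two `looks_like_wraparound*` helpers are hypothetical names, and XID wraparound is ignored. It assumes that the latest page is the one holding the most recently assigned XID (next XID minus one) and that the message appears when the cutoff page lies after that page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Matches the "xid % 2048 == 0" page boundary mentioned in the thread. */
#define XACTS_PER_PAGE 2048u

/* Simplified stand-in for TransactionIdToPage(); ignores XID wraparound. */
static uint32_t xid_to_page(uint32_t xid)
{
    return xid / XACTS_PER_PAGE;
}

/*
 * Hypothetical helper standing in for latest_page_number.
 * Assumption: the latest page is the page of the last XID handed out,
 * i.e. next_xid - 1.
 */
static uint32_t latest_page(uint32_t next_xid)
{
    assert(next_xid > 0);
    return xid_to_page(next_xid - 1);
}

/* True when the cutoff page lies after the latest page (the suspicious case). */
static bool looks_like_wraparound(uint32_t oldest_xid, uint32_t next_xid)
{
    return xid_to_page(oldest_xid) > latest_page(next_xid);
}

/*
 * Variant suggested in the thread: when oldest == next, derive the cutoff
 * page from oldest_xid - 1 instead of oldest_xid.
 */
static bool looks_like_wraparound_fixed(uint32_t oldest_xid, uint32_t next_xid)
{
    uint32_t cutoff_xid = (oldest_xid == next_xid) ? oldest_xid - 1 : oldest_xid;
    return xid_to_page(cutoff_xid) > latest_page(next_xid);
}
```

Under these assumptions, oldest == next == 4096 puts the cutoff on page 2 while the latest page is still page 1, which is the case that would trigger the message. Computing the cutoff from 4095 keeps it on page 1.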