Thread: We've been affected by a pg_upgrade bug. What do we do next?

We've been affected by a pg_upgrade bug. What do we do next?

From
Shaun Thomas
Date:
Hey,

We performed an upgrade via pg_upgrade from 9.1 to 9.3 a while back, and I'm almost certain we were bitten by this bug:

http://www.postgresql.org/message-id/20140530121631.GE25431@alap3.anarazel.de

Finding the discussion is nice... but what do we do to fix this? I read through the discussion, and it *seems* we can
deletethe 0000 file and restart since it's only an 8k file and we haven't gone far enough to wrap into a new 0000 file.
Willthat actually work, though? Or is it too late, since something has already requested that invalid transaction? What
dowe do? 

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com




______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


Re: We've been affected by a pg_upgrade bug. What do we do next?

From
Adrian Klaver
Date:
On 07/16/2014 07:08 AM, Shaun Thomas wrote:
> Hey,
>
> We performed an upgrade via pg_upgrade from 9.1 to 9.3 a while back, and I'm almost certain we were bitten by this
bug:
>
> http://www.postgresql.org/message-id/20140530121631.GE25431@alap3.anarazel.de
>
> Finding the discussion is nice... but what do we do to fix this? I read through the discussion, and it *seems* we can
deletethe 0000 file and restart since it's only an 8k file and we haven't gone far enough to wrap into a new 0000 file.
Willthat actually work, though? Or is it too late, since something has already requested that invalid transaction? What
dowe do? 

See here:

https://wiki.postgresql.org/wiki/20140702pg_upgrade_fix

>
> --
> Shaun Thomas
> OptionsHouse | 141 W. Jackson Blvd | Suite 800 | Chicago IL, 60604
> 312-676-8870
> sthomas@optionshouse.com
>
>
>
>
> ______________________________________________
>
> See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: We've been affected by a pg_upgrade bug. What do we do next?

From
Shaun Thomas
Date:
> See here:
>
> https://wiki.postgresql.org/wiki/20140702pg_upgrade_fix

Thank $Deity! We'll schedule this ASAP. Our cluster is whining about this about every 10 seconds, which is kind of a
PITA.

Thanks again!

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


Re: We've been affected by a pg_upgrade bug. What do we do next?

From
Andres Freund
Date:
On 2014-07-16 14:24:12 +0000, Shaun Thomas wrote:
>
> > See here:
> >
> > https://wiki.postgresql.org/wiki/20140702pg_upgrade_fix
>
> Thank $Deity! We'll schedule this ASAP. Our cluster is whining about this about every 10 seconds, which is kind of a
PITA.

What exactly is it whining about?

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: We've been affected by a pg_upgrade bug. What do we do next?

From
Shaun Thomas
Date:
This:

23334|4293964274||||53c68b69.5b26|2014-07-16 09:25:45 CDT|ERROR:  could not access status of transaction 7150346
23334|4293964274||||53c68b69.5b26|2014-07-16 09:25:45 CDT|DETAIL:  Could not open file "pg_multixact/offsets/006D": No
suchfile or directory. 

Been doing it every ten seconds since yesterday.

My log line prefix is this:

log_line_prefix = '%p|%x|%u|%d|%r|%c|%t|'

So the user, database, and connection source are all blank. The PID is different every time, too. Not sure what to make
ofthat. 

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


Re: We've been affected by a pg_upgrade bug. What do we do next?

From
Alvaro Herrera
Date:
Shaun Thomas wrote:
> This:
>
> 23334|4293964274||||53c68b69.5b26|2014-07-16 09:25:45 CDT|ERROR:  could not access status of transaction 7150346
> 23334|4293964274||||53c68b69.5b26|2014-07-16 09:25:45 CDT|DETAIL:  Could not open file "pg_multixact/offsets/006D":
Nosuch file or directory. 
>
> Been doing it every ten seconds since yesterday.
>
> My log line prefix is this:
>
> log_line_prefix = '%p|%x|%u|%d|%r|%c|%t|'
>
> So the user, database, and connection source are all blank. The PID is different every time, too. Not sure what to
makeof that. 

It's an autovacuum worker, which is expected.  Just get rid of the 0000
file and all should be well.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: We've been affected by a pg_upgrade bug. What do we do next?

From
Shaun Thomas
Date:
> It's an autovacuum worker, which is expected.  Just get rid of the 0000
> file and all should be well.

That's what I figured, but I didn't want to make assumptions. Does removing the 0000 file require a restart?

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


Re: We've been affected by a pg_upgrade bug. What do we do next?

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> Shaun Thomas wrote:
> > This:
> >
> > 23334|4293964274||||53c68b69.5b26|2014-07-16 09:25:45 CDT|ERROR:  could not access status of transaction 7150346
> > 23334|4293964274||||53c68b69.5b26|2014-07-16 09:25:45 CDT|DETAIL:  Could not open file "pg_multixact/offsets/006D":
Nosuch file or directory. 
> >
> > Been doing it every ten seconds since yesterday.
> >
> > My log line prefix is this:
> >
> > log_line_prefix = '%p|%x|%u|%d|%r|%c|%t|'
> >
> > So the user, database, and connection source are all blank. The PID is different every time, too. Not sure what to
makeof that. 
>
> It's an autovacuum worker, which is expected.  Just get rid of the 0000
> file and all should be well.

BTW if you do a hexdump of the 0000 file, it should be all zeroes.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: We've been affected by a pg_upgrade bug. What do we do next?

From
Shaun Thomas
Date:
> BTW if you do a hexdump of the 0000 file, it should be all zeroes.

Yep, that's what I get. :)

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


Re: We've been affected by a pg_upgrade bug. What do we do next?

From
Alvaro Herrera
Date:
Shaun Thomas wrote:
>
> > It's an autovacuum worker, which is expected.  Just get rid of the 0000
> > file and all should be well.
>
> That's what I figured, but I didn't want to make assumptions. Does
> removing the 0000 file require a restart?

Don't think so, but TBH I didn't try.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services