Thread: pg_upgrade cleanup

pg_upgrade cleanup

From

Bruce Momjian

Date:

15 May 2015, 01:57:00

This patch makes pg_upgrade controldata checks more consistent, and adds
a missing check for float8_pass_by_value.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Attachment

pg_upgrade.diff

Re: pg_upgrade cleanup

From

Bruce Momjian

Date:

15 May 2015, 02:06:15

On Thu, May 14, 2015 at 09:56:53PM -0400, Bruce Momjian wrote:
> This patch makes pg_upgrade controldata checks more consistent, and adds
> a missing check for float8_pass_by_value.

Sorry, I should have mentioned I applied this patch to head.  It isn't
significant enough to backpatch.

Re: pg_upgrade cleanup

From

Noah Misch

Date:

16 May 2015, 16:21:27

On Thu, May 14, 2015 at 10:06:11PM -0400, Bruce Momjian wrote:
> On Thu, May 14, 2015 at 09:56:53PM -0400, Bruce Momjian wrote:
> > This patch makes pg_upgrade controldata checks more consistent, and adds
> > a missing check for float8_pass_by_value.
> 
> Sorry, I should have mentioned I applied this patch to head.  It isn't
> significant enough to backpatch.

A float8_pass_by_value match is unnecessary, and requiring it creates needless
hassle for users.  Switching between USE_FLOAT8_BYVAL binaries and
!USE_FLOAT8_BYVAL binaries requires an initdb to get different values in
pg_type.typbyval and pg_attribute.attbyval.  pg_upgrade's use of pg_dump to
migrate catalog content addresses that fine.  Note that
check_for_isn_and_int8_passing_mismatch() exists because pg_upgrade has
allowed source and destination clusters to differ in USE_FLOAT8_BYVAL.
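
To make the point above concrete, here is a simplified model of when a float8_pass_by_value difference actually matters.  It is a sketch, not the pg_upgrade source: the function name and the idea of passing in a count of isn-dependent functions are assumptions made for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical model: a difference in float8_pass_by_value between the
 * old and new clusters is harmless on its own, because catalog content
 * is recreated via pg_dump.  It only blocks the upgrade when the old
 * cluster also uses functions from contrib/isn, which depend on how
 * int8 values are passed.
 */
bool
byval_mismatch_blocks_upgrade(bool old_float8_byval,
                              bool new_float8_byval,
                              int isn_function_count)
{
    /* Same setting on both sides: nothing to check. */
    if (old_float8_byval == new_float8_byval)
        return false;

    /* Settings differ: only a problem if isn functions are in use. */
    return isn_function_count > 0;
}
```

Under this model, a call such as byval_mismatch_blocks_upgrade(true, false, 0) reports no problem, which is why a blanket controldata match would be stricter than needed.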

The rest of this change is opaque to me.  "More consistent" with what?  The
old use of the "tli" variable sure looked like a bug, considering the variable
was never set to anything but zero.  What is the anticipated behavior change?

Re: pg_upgrade cleanup

From

Bruce Momjian

Date:

16 May 2015, 18:49:19

On Sat, May 16, 2015 at 12:21:12PM -0400, Noah Misch wrote:
> On Thu, May 14, 2015 at 10:06:11PM -0400, Bruce Momjian wrote:
> > On Thu, May 14, 2015 at 09:56:53PM -0400, Bruce Momjian wrote:
> > > This patch makes pg_upgrade controldata checks more consistent, and adds
> > > a missing check for float8_pass_by_value.
> > 
> > Sorry, I should have mentioned I applied this patch to head.  It isn't
> > significant enough to backpatch.
> 
> A float8_pass_by_value match is unnecessary, and requiring it creates needless
> hassle for users.  Switching between USE_FLOAT8_BYVAL binaries and
> !USE_FLOAT8_BYVAL binaries requires an initdb to get different values in
> pg_type.typbyval and pg_attribute.attbyval.  pg_upgrade's use of pg_dump to
> migrate catalog content addresses that fine.  Note that
> check_for_isn_and_int8_passing_mismatch() exists because pg_upgrade has
> allowed source and destination clusters to differ in USE_FLOAT8_BYVAL.

What we had was checking for float8_pass_by_value, but did nothing with
it, so I assumed we had lost the check somehow.  I will remove detecting
and checking of that value.  Thanks.

> The rest of this change is opaque to me.  "More consistent" with what?  The
> old use of the "tli" variable sure looked like a bug, considering the variable
> was never set to anything but zero.  What is the anticipated behavior change?

The problem was that the option checking was not in a consistent order,
so there was no easy way to make sure everything was being processed
properly.  The new ordering is consistent.

I thought the tli was a harmless cleanup but I now see it was passing a
zero timeline to pg_resetxlog.  The only reason that worked was because
pg_resetxlog ignores a timeline that is less than our current one, and
zero was always less than the timeline so pg_resetxlog was making no
timeline change at all.  I will clean that up too in back branches, as it
is too odd.  (I think it was broken by
038f3a05092365eca070bdc588554520dfd5ffb9).
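
The "ignores a timeline that is less than our current one" behavior can be modeled in a few lines.  This is only a sketch of the rule as described here, not the pg_resetxlog source; the function name effective_timeline is hypothetical.

```c
#include <stdint.h>

typedef uint32_t TimeLineID;

/*
 * Hypothetical model of the rule described above: a requested timeline
 * only takes effect when it is greater than the current one.  A request
 * of zero can therefore never change anything.
 */
TimeLineID
effective_timeline(TimeLineID current_tli, TimeLineID requested_tli)
{
    if (requested_tli > current_tli)
        return requested_tli;   /* timeline moves forward */
    return current_tli;         /* request is silently ignored */
}
```

With a freshly initialized cluster on timeline 1, effective_timeline(1, 0) stays at 1, which is why passing the never-set "tli" value went unnoticed.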

Re: pg_upgrade cleanup

From

Bruce Momjian

Date:

16 May 2015, 19:23:14

On Sat, May 16, 2015 at 02:49:08PM -0400, Bruce Momjian wrote:
> On Sat, May 16, 2015 at 12:21:12PM -0400, Noah Misch wrote:
> > On Thu, May 14, 2015 at 10:06:11PM -0400, Bruce Momjian wrote:
> > > On Thu, May 14, 2015 at 09:56:53PM -0400, Bruce Momjian wrote:
> > > > This patch makes pg_upgrade controldata checks more consistent, and adds
> > > > a missing check for float8_pass_by_value.
> > > 
> > > Sorry, I should have mentioned I applied this patch to head.  It isn't
> > > significant enough to backpatch.
> > 
> > A float8_pass_by_value match is unnecessary, and requiring it creates needless
> > hassle for users.  Switching between USE_FLOAT8_BYVAL binaries and
> > !USE_FLOAT8_BYVAL binaries requires an initdb to get different values in
> > pg_type.typbyval and pg_attribute.attbyval.  pg_upgrade's use of pg_dump to
> > migrate catalog content addresses that fine.  Note that
> > check_for_isn_and_int8_passing_mismatch() exists because pg_upgrade has
> > allowed source and destination clusters to differ in USE_FLOAT8_BYVAL.
> 
> What we had was checking for float8_pass_by_value, but did nothing with
> it, so I assumed we had lost the check somehow.  I will remove detecting
> and checking of that value.  Thanks.

Sorry, I mean we didn't do anything with it in controldata.c, but I had
forgotten how we use it for isn, so I just added a C comment that there
is no need to check that it matches.  Thanks.

Re: pg_upgrade cleanup

From

Bruce Momjian

Date:

18 May 2015, 14:58:48

On Sat, May 16, 2015 at 12:21:12PM -0400, Noah Misch wrote:
> The rest of this change is opaque to me.  "More consistent" with what?  The
> old use of the "tli" variable sure looked like a bug, considering the variable
> was never set to anything but zero.  What is the anticipated behavior change?

The fact you saw the bug helps in another way.  I was confused why we
had not gotten reports about incorrect timeline restoration in previous
versions of pg_upgrade.  It turns out that through 9.2, we always used a
timeline of 1:

    "\"%s/pg_resetxlog\" -l 1,%u,%u \"%s\"", new_cluster.bindir,
                            ^

In 9.3 we added code to restore the timeline, but the code that read the
9.2 pg_controldata was buggy, so it tried to restore a timeline of 0,
which was ignored because the timeline can't be decreased with
pg_resetxlog.  Only in 9.4 was the timeline properly restored, leading
to the missing history file error.

This confirms that setting the timeline to 1 unconditionally, as I did
on Friday, is the right fix, and I have added a C comment so we will
remember _why_ this has to be the case.
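
As a rough illustration of "timeline 1 unconditionally", the sketch below builds a command string using the pre-9.3 format quoted above, with the timeline hard-coded.  The helper name build_resetxlog_cmd and its parameter names are assumptions for this example; it is not the committed pg_upgrade code.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format a pg_resetxlog command line with the
 * timeline fixed at 1, regardless of what the old cluster reported.
 * Returns the number of characters written, or -1 if the buffer is
 * too small or formatting fails.
 */
int
build_resetxlog_cmd(char *buf, size_t buflen, const char *bindir,
                    unsigned int logid, unsigned int nxtlogseg,
                    const char *pgdata)
{
    /* The literal "1" is the timeline; only the WAL position varies. */
    int n = snprintf(buf, buflen,
                     "\"%s/pg_resetxlog\" -l 1,%u,%u \"%s\"",
                     bindir, logid, nxtlogseg, pgdata);

    if (n < 0 || (size_t) n >= buflen)
        return -1;
    return n;
}
```

Keeping the timeline as a literal in the format string, rather than a variable, makes it hard to reintroduce a value read from the old cluster's pg_controldata by accident.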

Re: pg_upgrade cleanup

From

Noah Misch

Date:

22 May 2015, 02:19:18

On Mon, May 18, 2015 at 10:58:42AM -0400, Bruce Momjian wrote:
> On Sat, May 16, 2015 at 12:21:12PM -0400, Noah Misch wrote:
> > The rest of this change is opaque to me.  "More consistent" with what?  The
> > old use of the "tli" variable sure looked like a bug, considering the variable
> > was never set to anything but zero.  What is the anticipated behavior change?
> 
> The fact you saw the bug helps in another way.  I was confused why we
> had not gotten reports about incorrect timeline restoration in previous
> versions of pg_upgrade.  It turns out that through 9.2, we always used a
> timeline of 1:
> 
>     "\"%s/pg_resetxlog\" -l 1,%u,%u \"%s\"", new_cluster.bindir,
>                             ^
> 
> In 9.3 we added code to restore the timeline, but the code that read the
> 9.2 pg_controldata was buggy, so it tried to restore a timeline of 0,
> which was ignored because the timeline can't be decreased with
> pg_resetxlog.  Only in 9.4 was the timeline properly restored, leading
> to the missing history file error.

That clears up several things.  Interesting.