Thread: Re: pg_upgrade: can I use same binary for old & new?

Re: pg_upgrade: can I use same binary for old & new?

From
Adrian Klaver
Date:
On 7/5/25 09:52, Pierre Fortin wrote:
> Hi,
> 
> [Hope this gets through after dumping DKIM-ignorant mail provider.]
> 
> Wanting to upgrade from:
> PostgreSQL 15.13 on x86_64-mageia-linux-gnu,
> compiled by gcc (Mageia 15.1.0-1.mga10) 15.1.0, 64-bit
> to:
> PG 17.5
> 
> Way back, I was able to use -k|--link option on pg_upgrade (PG13 to PG15);
> but since then:
> 
> - my DB has grown to over 8TB

How did you measure above?

> - even with ~70TB, I don't have enough contiguous disk space to
>    dump/restore

What was the pg_dump command?


> Thanks,
> Pierre

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: pg_upgrade: can I use same binary for old & new?

From
Ron Johnson
Date:
On Sat, Jul 5, 2025 at 2:11 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> On 7/5/25 09:52, Pierre Fortin wrote:
>> - my DB has grown to over 8TB
>
> How did you measure above?
>
>> - even with ~70TB, I don't have enough contiguous disk space to
>>    dump/restore
>
> What was the pg_dump command?
For something that big, he must have been doing an uncompressed plain-format dump instead of a directory- or custom-format dump.  Maybe he even added --attribute-inserts too.
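
For scale, the kind of dump being described would look something like this
(database and file names are placeholders):

  # plain-format SQL with one INSERT per row: huge, and not compressed
  pg_dump -d mydb --attribute-inserts -f mydb.sql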

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

Re: pg_upgrade: can I use same binary for old & new?

From
pf@pfortin.com
Date:
On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:

>On 7/5/25 09:52, Pierre Fortin wrote:

>> Wanting to upgrade from:
>> PostgreSQL 15.13 on x86_64-mageia-linux-gnu,
>> compiled by gcc (Mageia 15.1.0-1.mga10) 15.1.0, 64-bit
>> to:
>> PG 17.5
>> 
>> Way back, I was able to use -k|--link option on pg_upgrade (PG13 to PG15);
>> but since then:
>> 
>> - my DB has grown to over 8TB  
>
>How did you measure above?

# du -sb /var/lib/pgsql/data
8227910662297   /var/lib/pgsql/data

>> - even with ~70TB, I don't have enough contiguous disk space to
>>    dump/restore  
>
>What was the pg_dump command?

Didn't try given:
$ df /mnt/db
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdh1        17T   13T  3.0T  82% /mnt/db

I suppose I could dump each of the 1408 objects to various available
drives; but given my previous experience with PG13 to PG15 using --link,
which took seconds, I'm hoping to avoid wasting time (at my age, hours
matter).

Cheers,
Pierre



Re: pg_upgrade: can I use same binary for old & new?

From
Tom Lane
Date:
pf@pfortin.com writes:
> On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:
>> How did you measure above?

> # du -sb /var/lib/pgsql/data
> 8227910662297   /var/lib/pgsql/data

It's likely that there's a good deal of bloat in that.  Even if there's not
much bloat, this number will include indexes and WAL data that don't
appear in pg_dump output.
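
A rough way to gauge the size minus WAL is to ask the server itself;
something like the following, where the connection options are placeholders:

  psql -U postgres -c "SELECT datname,
      pg_size_pretty(pg_database_size(datname))
    FROM pg_database ORDER BY pg_database_size(datname) DESC;"

pg_database_size() still counts indexes and bloat, so the pg_dump output
will typically be smaller still.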

>> What was the pg_dump command?

> Didn't try given:
> $ df /mnt/db
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sdh1        17T   13T  3.0T  82% /mnt/db

I'd say give it a try; be sure to use one of the pg_dump modes
that compress the data.
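
For example, either of these compresses with gzip by default (names are
placeholders):

  # custom format: a single compressed archive
  pg_dump -Fc -d mydb -f mydb.dump

  # directory format: compressed per-table files, can dump in parallel
  pg_dump -Fd -j 4 -d mydb -f mydb.dump.d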

            regards, tom lane



Re: pg_upgrade: can I use same binary for old & new?

From
Ron Johnson
Date:
On Sat, Jul 5, 2025 at 2:24 PM <pf@pfortin.com> wrote:
> On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:
>
>> What was the pg_dump command?
>
> Didn't try given:
> $ df /mnt/db
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sdh1        17T   13T  3.0T  82% /mnt/db
>
> I suppose I could dump each of the 1408 objects to various available
> drives; but given my previous experience with PG13 to PG15 using --link,
> which took seconds, I'm hoping to avoid wasting time (at my age, hours
> matter).

There's something you're not telling us.  The whole point of "pg_upgrade --link" is an in-place upgrade; it might use a few extra GB of disk space when it dumps the PG15 schema and restores it into PG17.  So why are you worried about running out of disk space?
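
For reference, the in-place route is a sketch like this (paths are
hypothetical; the PG17 data directory must already be initdb'ed, and both
sets of binaries must be installed):

  pg_upgrade --check \
    -b /usr/lib/pgsql15/bin  -B /usr/lib/pgsql17/bin \
    -d /var/lib/pgsql15/data -D /var/lib/pgsql17/data \
    --link

Run it with --check first as a dry run, then without --check to upgrade.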

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

Re: pg_upgrade: can I use same binary for old & new?

From
Pierre Fortin
Date:
On Sat, 05 Jul 2025 14:30:20 -0400 Tom Lane wrote:

Forgive my ignorance; always trying to learn more... :)

>pf@pfortin.com writes:
>> On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:  
>>> How did you measure above?  
>
>> # du -sb /var/lib/pgsql/data
>> 8227910662297   /var/lib/pgsql/data  
>
>It's likely that there's a deal of bloat in that.  Even if there's not
>much bloat, this number will include indexes and WAL data that don't
>appear in pg_dump output.

Does this imply that on restore, I'll have to re-index everything?

>>> What was the pg_dump command?  
>
>> Didn't try given:
>> $ df /mnt/db
>> Filesystem      Size  Used Avail Use% Mounted on
>> /dev/sdh1        17T   13T  3.0T  82% /mnt/db  
>
>I'd say give it a try; be sure to use one of the pg_dump modes
>that compress the data.

OK...  I failed to mention I have several databases in this cluster; so
digging into pg_dumpall, I see:
   --binary-upgrade
    This option is for use by in-place upgrade utilities. Its use for
    other purposes is not recommended or supported. The behavior of the
    option may change in future releases without notice.

pg_upgrade has a --link option, but I'm puzzled to find an upgrade-related
option in a dumpall/restore tool. My imagination wonders if this alludes to
a way to do something like:
 pg_dumpall --globals-only --roles-only --schema-only ...
Would restoring this be a way to update only the control structures? Big
assumption that the actual data remains untouched...

Inquiring mind...  :)

Back to my upgrade issue...
All my DBs are static (only queried once loaded). Assuming the dumpall
file fits on one of my drives:
 pg_dumpall -f <path>/PG.backup -v
appears to be all I need? pg_dump has compression by default, but I don't
see compression with pg_dumpall other than for TOAST.

Thanks, You guys are awesome!
 
>            regards, tom lane



Re: pg_upgrade: can I use same binary for old & new?

From
Tom Lane
Date:
Pierre Fortin <pf@pfortin.com> writes:
> OK...  I failed to mention I have several databases in this cluster; so
> digging into pg_dumpall, I see:
>    --binary-upgrade
>     This option is for use by in-place upgrade utilities. Its use for
>     other purposes is not recommended or supported. The behavior of the
>     option may change in future releases without notice.

That is infrastructure for pg_upgrade to use.  Do not try to use it
manually; it won't end well.

> All my DBs are static (only queried once loaded). Assuming the dumpall
> file fits on one of my drives:
>  pg_dumpall -f <path>/PG.backup -v
> appears to be all I need? pg_dump has compression by default, but I don't
> see compression with pg_dumpall other than for TOAST.

I would try that first before messing with compression.  If it doesn't
fit, you'll need to do pg_dumpall --globals-only (mainly to capture
your role definitions) and then pg_dump each database into a separate
compressed file.
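
As a sketch, that plan could look like this (user, paths, and dump location
are placeholders):

  pg_dumpall -U postgres --globals-only > globals.sql
  for db in $(psql -U postgres -At \
      -c "SELECT datname FROM pg_database WHERE NOT datistemplate"); do
    pg_dump -U postgres -Fc -d "$db" -f "/mnt/dumps/$db.dump"
  done

Each file is restorable independently with pg_restore, so the dumps can be
spread across different drives if no single one is big enough.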

            regards, tom lane



Re: pg_upgrade: can I use same binary for old & new?

From
Ron Johnson
Date:
On Sat, Jul 5, 2025 at 3:19 PM Pierre Fortin <pf@pfortin.com> wrote:
> On Sat, 05 Jul 2025 14:30:20 -0400 Tom Lane wrote:
>
>> I'd say give it a try; be sure to use one of the pg_dump modes
>> that compress the data.
>
> OK...  I failed to mention I have several databases in this cluster; so
> digging into pg_dumpall, I see:
>    --binary-upgrade
>     This option is for use by in-place upgrade utilities. Its use for
>     other purposes is not recommended or supported. The behavior of the
>     option may change in future releases without notice.
>
> pg_upgrade has a --link option, but I'm puzzled to find an upgrade-related
> option in a dumpall/restore tool.

It's _not_ part of a dumpall/restore process.

You _either_ run
- pg_upgrade --link
  OR
- pg_dumpall --globals-only > globals.sql / psql -f globals.sql
- pg_dump --format=directory / pg_restore --format=directory of db1
- pg_dump --format=directory / pg_restore --format=directory of db2
- pg_dump --format=directory / pg_restore --format=directory of db3
- pg_dump --format=directory / pg_restore --format=directory of etc...

Why not a plain pg_dumpall of the whole instance?  Because that would create a GINORMOUS text file that can only be loaded in a single-threaded manner.
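
Per database, the parallel pair might look like (job counts are guesses;
tune them to your cores and disks):

  # dump with the old binaries, in parallel
  pg_dump -Fd -j 8 -d db1 -f /mnt/db/db1.dump.d

  # restore into the new cluster in parallel (create db1 there first, or add -C)
  pg_restore -j 8 -d db1 /mnt/db/db1.dump.d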
 




--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

Re: pg_upgrade: can I use same binary for old & new?

From
Adrian Klaver
Date:
On 7/5/25 11:24, pf@pfortin.com wrote:
> On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:
> 

> Didn't try given:
> $ df /mnt/db
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sdh1        17T   13T  3.0T  82% /mnt/db

You said you have ~70TB of free space, so where is the other ~63TB?


-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: pg_upgrade: can I use same binary for old & new?

From
Adrian Klaver
Date:
On 7/5/25 12:19, Pierre Fortin wrote:
> On Sat, 05 Jul 2025 14:30:20 -0400 Tom Lane wrote:
> 
> Forgive my ignorance; always trying to learn more... :)
> 
>> pf@pfortin.com writes:
>>> On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:
>>>> How did you measure above?
>>
>>> # du -sb /var/lib/pgsql/data
>>> 8227910662297   /var/lib/pgsql/data
>>
>> It's likely that there's a deal of bloat in that.  Even if there's not
>> much bloat, this number will include indexes and WAL data that don't
>> appear in pg_dump output.
> 
> Does this imply that on restore, I'll have to re-index everything?

The dump file includes CREATE INDEX commands and per:

https://www.postgresql.org/docs/current/sql-createindex.html

"Creating an index can interfere with regular operation of a database. 
Normally PostgreSQL locks the table to be indexed against writes and 
performs the entire index build with a single scan of the table. Other 
transactions can still read the table, but if they try to insert, 
update, or delete rows in the table they will block until the index 
build is finished. This could have a severe effect if the system is a 
live production database. Very large tables can take many hours to be 
indexed, and even for smaller tables, an index build can lock out 
writers for periods that are unacceptably long for a production system."

Which is why pg_restore:

https://www.postgresql.org/docs/current/app-pgrestore.html

has:

"-j number-of-jobs
--jobs=number-of-jobs

     Run the most time-consuming steps of pg_restore — those that load 
data, create indexes, or create constraints — concurrently, using up to 
number-of-jobs concurrent sessions. This option can dramatically reduce 
the time to restore a large database to a server running on a 
multiprocessor machine. This option is ignored when emitting a script 
rather than connecting directly to a database server."
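
Concretely, that would be something like (the job count is a guess):

pg_restore -j 8 -d <some_db> <some_db>.out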



-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: pg_upgrade: can I use same binary for old & new?

From
Adrian Klaver
Date:
On 7/5/25 12:19, Pierre Fortin wrote:
> On Sat, 05 Jul 2025 14:30:20 -0400 Tom Lane wrote:
> 

>> I'd say give it a try; be sure to use one of the pg_dump modes
>> that compress the data.
> 
> Back to my upgrade issue...
> All my DBs are static (only queried once loaded). Assuming the dumpall
> file fits on one of my drives:
>   pg_dumpall -f <path>/PG.backup -v
> appears to be all I need? pg_dump has compression by default, but I don't
> see compression with pg_dumpall other than for TOAST.

If you really want to use pg_dumpall and get compression then something 
like:

pg_dumpall -U postgres | gzip > pg_backup.gz

Though this will take some time and is probably better handled using:

pg_dumpall -U postgres -g > pg_globals.sql

and then:

pg_dump -d <some_db> -U postgres -Fc -f <some_db>.out

for each database. This will use compression by default.

Neither of these options will be as quick as doing pg_upgrade with 
--link. Though at this point you are boxed in by not being able to run 
multiple Postgres versions on one machine.

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: pg_upgrade: can I use same binary for old & new?

From
Pierre Fortin
Date:
On Sat, 5 Jul 2025 12:58:10 -0700 Adrian Klaver wrote:

>On 7/5/25 11:24, pf@pfortin.com wrote:
>> On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:
>>   
>
>> Didn't try given:
>> $ df /mnt/db
>> Filesystem      Size  Used Avail Use% Mounted on
>> /dev/sdh1        17T   13T  3.0T  82% /mnt/db  
>
>You said you have ~70TB of free space, so where is the other ~63TB?

I never said "free space" with ~70TB; that's the total space across about
8 drives :)
The biggest free space I have is 7.6TB, which is less than the 8TB DB;
but thanks to the responses, I should be able to make this work...

Also, I appreciate the clarification re CREATE INDEX (Doh!) and --jobs.

Best,
Pierre