Thread: Re: pg_upgrade: can I use same binary for old & new?
On 7/5/25 09:52, Pierre Fortin wrote:
> Hi,
>
> [Hope this gets through after dumping DKIM-ignorant mail provider.]
>
> Wanting to upgrade from:
> PostgreSQL 15.13 on x86_64-mageia-linux-gnu,
> compiled by gcc (Mageia 15.1.0-1.mga10) 15.1.0, 64-bit
> to:
> PG 17.5
>
> Way back, I was able to use -k|--link option on pg_upgrade (PG13 to PG15);
> but since then:
>
> - my DB has grown to over 8TB

How did you measure above?

> - even with ~70TB, I don't have enough contiguous disk space to
> dump/restore

What was the pg_dump command?

> Thanks,
> Pierre

--
Adrian Klaver
adrian.klaver@aklaver.com
On Sat, Jul 5, 2025 at 2:11 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> On 7/5/25 09:52, Pierre Fortin wrote:
>> Hi,
>>
>> [Hope this gets through after dumping DKIM-ignorant mail provider.]
>>
>> Wanting to upgrade from:
>> PostgreSQL 15.13 on x86_64-mageia-linux-gnu,
>> compiled by gcc (Mageia 15.1.0-1.mga10) 15.1.0, 64-bit
>> to:
>> PG 17.5
>>
>> Way back, I was able to use -k|--link option on pg_upgrade (PG13 to PG15);
>> but since then:
>>
>> - my DB has grown to over 8TB
>
> How did you measure above?
>
>> - even with ~70TB, I don't have enough contiguous disk space to
>> dump/restore
>
> What was the pg_dump command?
For something that big, he must have been doing an uncompressed plain format dump instead of a directory/custom format dump. Maybe even added --attribute-inserts too.
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:

> On 7/5/25 09:52, Pierre Fortin wrote:
>> Wanting to upgrade from:
>> PostgreSQL 15.13 on x86_64-mageia-linux-gnu,
>> compiled by gcc (Mageia 15.1.0-1.mga10) 15.1.0, 64-bit
>> to:
>> PG 17.5
>>
>> Way back, I was able to use -k|--link option on pg_upgrade (PG13 to PG15);
>> but since then:
>>
>> - my DB has grown to over 8TB
>
> How did you measure above?

# du -sb /var/lib/pgsql/data
8227910662297   /var/lib/pgsql/data

>> - even with ~70TB, I don't have enough contiguous disk space to
>> dump/restore
>
> What was the pg_dump command?

Didn't try given:

$ df /mnt/db
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdh1        17T   13T  3.0T  82% /mnt/db

I suppose I could dump each of the 1408 objects to various available
drives; but given my previous experience with PG13 to PG15 using --link
which took seconds, I'm hoping to avoid wasting time (at my age, hours
matter).

Cheers,
Pierre
pf@pfortin.com writes:
> On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:
>> How did you measure above?

> # du -sb /var/lib/pgsql/data
> 8227910662297   /var/lib/pgsql/data

It's likely that there's a deal of bloat in that.  Even if there's not
much bloat, this number will include indexes and WAL data that don't
appear in pg_dump output.

>> What was the pg_dump command?

> Didn't try given:
> $ df /mnt/db
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sdh1        17T   13T  3.0T  82% /mnt/db

I'd say give it a try; be sure to use one of the pg_dump modes
that compress the data.

			regards, tom lane
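For reference, among pg_dump's output formats the custom (-Fc) and
directory (-Fd) formats compress by default, while the plain text format
does not. A minimal sketch, with an illustrative database name and
output path:

  # custom format: single file, compressed by default
  pg_dump -Fc -d mydb -f /mnt/db/mydb.dump

  # directory format: also compressed, and the only format that
  # supports parallel dump jobs (-j)
  pg_dump -Fd -j 4 -d mydb -f /mnt/db/mydb.dir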
On Sat, Jul 5, 2025 at 2:24 PM <pf@pfortin.com> wrote:
> On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:
>> On 7/5/25 09:52, Pierre Fortin wrote:
>>> Wanting to upgrade from:
>>> PostgreSQL 15.13 on x86_64-mageia-linux-gnu,
>>> compiled by gcc (Mageia 15.1.0-1.mga10) 15.1.0, 64-bit
>>> to:
>>> PG 17.5
>>>
>>> Way back, I was able to use -k|--link option on pg_upgrade (PG13 to PG15);
>>> but since then:
>>>
>>> - my DB has grown to over 8TB
>>
>> How did you measure above?
>
> # du -sb /var/lib/pgsql/data
> 8227910662297   /var/lib/pgsql/data
>
>>> - even with ~70TB, I don't have enough contiguous disk space to
>>> dump/restore
>>
>> What was the pg_dump command?
>
> Didn't try given:
> $ df /mnt/db
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sdh1        17T   13T  3.0T  82% /mnt/db
>
> I suppose I could dump each of the 1408 objects to various available
> drives; but given my previous experience with PG13 to PG15 using --link
> which took seconds, I'm hoping to avoid wasting time (at my age, hours
> matter).
There's something you're not telling us. The whole point of "pg_upgrade --link" is an in-place upgrade. It might use a few extra GB of disk space for when it backs up the PG15 schema and restores it to PG17. Thus, why are you complaining about running out of disk space?
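For reference, a minimal sketch of such an in-place upgrade; the bindir
and datadir paths here are illustrative, not the poster's actual layout,
and both the 15 and 17 binaries must be installed side by side:

  pg_upgrade \
    --old-bindir  /usr/pgsql-15/bin \
    --new-bindir  /usr/pgsql-17/bin \
    --old-datadir /var/lib/pgsql/data \
    --new-datadir /var/lib/pgsql/17/data \
    --link

Because --link hard-links the data files instead of copying them (which
requires old and new data directories on the same filesystem), the
upgrade needs only a few extra GB for the new cluster's catalogs, not a
second copy of the 8TB of data.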
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On Sat, 05 Jul 2025 14:30:20 -0400 Tom Lane wrote:

Forgive my ignorance; always trying to learn more... :)

> pf@pfortin.com writes:
>> On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:
>>> How did you measure above?
>
>> # du -sb /var/lib/pgsql/data
>> 8227910662297   /var/lib/pgsql/data
>
> It's likely that there's a deal of bloat in that.  Even if there's not
> much bloat, this number will include indexes and WAL data that don't
> appear in pg_dump output.

Does this imply that on restore, I'll have to re-index everything?

>>> What was the pg_dump command?
>
>> Didn't try given:
>> $ df /mnt/db
>> Filesystem      Size  Used Avail Use% Mounted on
>> /dev/sdh1        17T   13T  3.0T  82% /mnt/db
>
> I'd say give it a try; be sure to use one of the pg_dump modes
> that compress the data.

OK... I failed to mention I have several databases in this cluster; so
digging into pg_dumpall, I see:

  --binary-upgrade
    This option is for use by in-place upgrade utilities. Its use for
    other purposes is not recommended or supported. The behavior of the
    option may change in future releases without notice.

pg_upgrade has --link option; but I'm puzzled by this option in a
dumpall/restore process. My imagination wonders if this alludes to a way
to do something like:

  pg_dumpall --globals-only --roles-only --schema-only ...

Would restoring this be a way to update only the control structures? Big
assumption that the actual data remains untouched...

Inquiring mind... :)

Back to my upgrade issue...
All my DBs are static (only queries once loaded). Assuming the dumpall
file fits on one of my drives:

  pg_dumpall -f <path>/PG.backup -v

appears to be all I need? pg_dump has compression by default; but I don't
see compression with dumpall other than for TOAST.

Thanks, You guys are awesome!

> regards, tom lane
Pierre Fortin <pf@pfortin.com> writes:
> OK... I failed to mention I have several databases in this cluster; so
> digging into pg_dumpall, I see:
>   --binary-upgrade
>     This option is for use by in-place upgrade utilities. Its use for
>     other purposes is not recommended or supported. The behavior of the
>     option may change in future releases without notice.

That is infrastructure for pg_upgrade to use.  Do not try to use it
manually; it won't end well.

> All my DBs are static (only queries once loaded). Assuming the dumpall
> file fits on one of my drives:
>   pg_dumpall -f <path>/PG.backup -v
> appears to be all I need? pg_dump has compression by default; but I don't
> see compression with dumpall other than for TOAST.

I would try that first before messing with compression.  If it doesn't
fit, you'll need to do pg_dumpall --globals-only (mainly to capture
your role definitions) and then pg_dump each database into a separate
compressed file.

			regards, tom lane
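A minimal sketch of that fallback, assuming a superuser role named
postgres and illustrative output paths; querying pg_database is one way
to enumerate the databases:

  # roles, tablespaces, and other cluster-wide objects
  pg_dumpall -U postgres --globals-only > /mnt/db/globals.sql

  # one compressed custom-format dump per database (templates skipped)
  for db in $(psql -U postgres -At -c \
      "SELECT datname FROM pg_database WHERE datallowconn AND NOT datistemplate"); do
      pg_dump -U postgres -Fc -d "$db" -f "/mnt/db/$db.dump"
  done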
On Sat, Jul 5, 2025 at 3:19 PM Pierre Fortin <pf@pfortin.com> wrote:
> On Sat, 05 Jul 2025 14:30:20 -0400 Tom Lane wrote:
>
> Forgive my ignorance; always trying to learn more... :)
>
>> pf@pfortin.com writes:
>>> On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:
>>>> How did you measure above?
>>
>>> # du -sb /var/lib/pgsql/data
>>> 8227910662297   /var/lib/pgsql/data
>>
>> It's likely that there's a deal of bloat in that.  Even if there's not
>> much bloat, this number will include indexes and WAL data that don't
>> appear in pg_dump output.
>
> Does this imply that on restore, I'll have to re-index everything?
>
>>>> What was the pg_dump command?
>>
>>> Didn't try given:
>>> $ df /mnt/db
>>> Filesystem      Size  Used Avail Use% Mounted on
>>> /dev/sdh1        17T   13T  3.0T  82% /mnt/db
>>
>> I'd say give it a try; be sure to use one of the pg_dump modes
>> that compress the data.
>
> OK... I failed to mention I have several databases in this cluster; so
> digging into pg_dumpall, I see:
>   --binary-upgrade
>     This option is for use by in-place upgrade utilities. Its use for
>     other purposes is not recommended or supported. The behavior of the
>     option may change in future releases without notice.
>
> pg_upgrade has --link option; but I'm puzzled by this option in a
> dumpall/restore process.
It's _not_ part of a dumpall/restore process.
You _either_ run
- pg_upgrade --link
OR
- pg_dumpall --globals-only > globals.sql / psql -f globals.sql
- pg_dump --format=directory / pg_restore --format=directory of db1
- pg_dump --format=directory / pg_restore --format=directory of db2
- pg_dump --format=directory / pg_restore --format=directory of db3
- pg_dump --format=directory / pg_restore --format=directory of etc...
Why not a plain pg_dumpall of the whole instance? Because that would create a GINORMOUS text file which can only be loaded in a single-threaded manner.
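A minimal sketch of one per-database leg of that path, with illustrative
names; the directory format lets both the dump and the restore run with
parallel jobs:

  pg_dump -Fd -j 4 -d db1 -f /mnt/db/db1.dir   # parallel, compressed dump
  createdb db1                                 # on the new cluster
  pg_restore -j 4 -d db1 /mnt/db/db1.dir       # parallel restore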
> My imagination wonders if this alludes to a way
> to do something like:
>   pg_dumpall --globals-only --roles-only --schema-only ...
> Would restoring this be a way to update only the control structures? Big
> assumption that the actual data remains untouched...
>
> Inquiring mind... :)
>
> Back to my upgrade issue...
> All my DBs are static (only queries once loaded). Assuming the dumpall
> file fits on one of my drives:
>   pg_dumpall -f <path>/PG.backup -v
> appears to be all I need? pg_dump has compression by default; but I don't
> see compression with dumpall other than for TOAST.
>
> Thanks, You guys are awesome!
>
>> regards, tom lane
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
On 7/5/25 11:24, pf@pfortin.com wrote:
> On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:
>
> Didn't try given:
>
> $ df /mnt/db
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sdh1        17T   13T  3.0T  82% /mnt/db

You said you have ~70TB of free space, so where is the other ~63TB?

> I suppose I could dump each of the 1408 objects to various available
> drives; but given my previous experience with PG13 to PG15 using --link
> which took seconds; I'm hoping to avoid wasting time (at my age, hours
> matter).
>
> Cheers,
> Pierre

--
Adrian Klaver
adrian.klaver@aklaver.com
On 7/5/25 12:19, Pierre Fortin wrote:
> On Sat, 05 Jul 2025 14:30:20 -0400 Tom Lane wrote:
>
> Forgive my ignorance; always trying to learn more... :)
>
>> pf@pfortin.com writes:
>>> On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:
>>>> How did you measure above?
>>
>>> # du -sb /var/lib/pgsql/data
>>> 8227910662297   /var/lib/pgsql/data
>>
>> It's likely that there's a deal of bloat in that. Even if there's not
>> much bloat, this number will include indexes and WAL data that don't
>> appear in pg_dump output.
>
> Does this imply that on restore, I'll have to re-index everything?

The dump file includes CREATE INDEX commands and per:

https://www.postgresql.org/docs/current/sql-createindex.html

"Creating an index can interfere with regular operation of a database.
Normally PostgreSQL locks the table to be indexed against writes and
performs the entire index build with a single scan of the table. Other
transactions can still read the table, but if they try to insert,
update, or delete rows in the table they will block until the index
build is finished. This could have a severe effect if the system is a
live production database. Very large tables can take many hours to be
indexed, and even for smaller tables, an index build can lock out
writers for periods that are unacceptably long for a production system."

Which is why pg_restore:

https://www.postgresql.org/docs/current/app-pgrestore.html

has:

"-j number-of-jobs
--jobs=number-of-jobs

Run the most time-consuming steps of pg_restore — those that load data,
create indexes, or create constraints — concurrently, using up to
number-of-jobs concurrent sessions. This option can dramatically reduce
the time to restore a large database to a server running on a
multiprocessor machine. This option is ignored when emitting a script
rather than connecting directly to a database server."

--
Adrian Klaver
adrian.klaver@aklaver.com
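A minimal sketch of that option, with an illustrative job count and dump
path; restoring does re-run every CREATE INDEX, but -j spreads the data
loads and index builds across concurrent sessions:

  pg_restore -j 8 -d mydb /mnt/db/mydb.dir   # 8 concurrent sessions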
On 7/5/25 12:19, Pierre Fortin wrote:
> On Sat, 05 Jul 2025 14:30:20 -0400 Tom Lane wrote:
>
>> I'd say give it a try; be sure to use one of the pg_dump modes
>> that compress the data.
>
> OK... I failed to mention I have several databases in this cluster; so
> digging into pg_dumpall, I see:
>   --binary-upgrade
>     This option is for use by in-place upgrade utilities. Its use for
>     other purposes is not recommended or supported. The behavior of the
>     option may change in future releases without notice.
>
> pg_upgrade has --link option; but I'm puzzled by this option in a
> dumpall/restore process. My imagination wonders if this alludes to a way
> to do something like:
>   pg_dumpall --globals-only --roles-only --schema-only ...
> Would restoring this be a way to update only the control structures? Big
> assumption that the actual data remains untouched...
>
> Inquiring mind... :)
>
> Back to my upgrade issue...
> All my DBs are static (only queries once loaded). Assuming the dumpall
> file fits on one of my drives:
>   pg_dumpall -f <path>/PG.backup -v

If you really want to use pg_dumpall and get compression then something
like:

  pg_dumpall -U postgres | gzip > pg_backup.gz

Though this will take some time and really is probably better handled
using:

  pg_dumpall -U postgres -g > pg_globals.sql

and then:

  pg_dump -d <some_db> -U postgres -Fc -f <some_db>.out

for each database. This will use compression by default.

Neither of these options will be as quick as doing pg_upgrade with
--link. Though at this point you are boxed in by not being able to run
multiple Postgres versions on one machine.

> appears to be all I need? pg_dump has compression by default; but I don't
> see compression with dumpall other than for TOAST.

--
Adrian Klaver
adrian.klaver@aklaver.com
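The matching restore side, under the same assumptions (role postgres,
illustrative file names); globals go in first, then each database:

  psql -U postgres -d postgres -f pg_globals.sql   # recreate roles etc. first
  createdb -U postgres some_db
  pg_restore -U postgres -d some_db some_db.out

or, for the gzipped pg_dumpall output:

  gunzip -c pg_backup.gz | psql -U postgres -d postgres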
On Sat, 5 Jul 2025 12:58:10 -0700 Adrian Klaver wrote:

> On 7/5/25 11:24, pf@pfortin.com wrote:
>> On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:
>>
>> Didn't try given:
>> $ df /mnt/db
>> Filesystem      Size  Used Avail Use% Mounted on
>> /dev/sdh1        17T   13T  3.0T  82% /mnt/db
>
> You said you have ~70TB of free space, so where is the other ~63TB?

I never said "free space" with ~70TB; that's the total space on about 8
drives :)  The biggest free space I have is 7.6TB, which is less than the
8TB DB; but thanks to the responses, I should be able to make this
work...

Also, I appreciate the clarification re CREATE INDEX (Doh!) and --jobs.

Best,
Pierre