RE: pg_upgrade —link does it remove table bloat - Mailing list pgsql-general

From Jason Ralph
Subject RE: pg_upgrade —link does it remove table bloat
Date
Msg-id BN7PR04MB3826BC0C08FFEA1D680440DFD01A0@BN7PR04MB3826.namprd04.prod.outlook.com
Whole thread Raw
In response to Re: pg_upgrade —link does it remove table bloat  (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses Re: pg_upgrade —link does it remove table bloat  (Adrian Klaver <adrian.klaver@aklaver.com>)
List pgsql-general
>Well table bloat and table statistics are two different things. Bloat is the accumulation of dead or potentially dead
tupleswhose space has not been marked > as available for reuse by VACUUM or whose space has been returned to the OS
withVACUUM FULL.  For more information see: 

Thanks for the helpful response @Adrian Klaver,
Let me try to rephrase my question,
If a table has bloat before the upgrade, autvacuum was not aggressive enough, once pg_upgrade is complete, the same
tablewill contain the same amount of bloat(dead tuples)?  Meaning its not the same as pg_dump / pg_restore since it’s a
hardlink to the previous data location.  Pg_upgrade with link will not recreate the table. 

Jason Ralph

-----Original Message-----
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, February 13, 2020 10:46 AM
To: Jason Ralph <jralph@affinitysolutions.com>; pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade —link does it remove table bloat

On 2/13/20 4:31 AM, Jason Ralph wrote:
> When using the pg_upgrade link method to upgrade Postgres a major
> version. Let’s say 9.3 to 11.6 on Centos Linux. Will table bloat carry
> over to the new version. I know using —link will use hard link
> pointers to the new data. So I assume all table bloat will carry over
> to the new version. I also know that pg_upgrade will reset statistics,
> so does the table remain bloated but statistics show otherwise? Can
> Someone please help me answer this? Or link where it’s outlined in the
> manual. Thanks as always.

Well table bloat and table statistics are two different things. Bloat is the accumulation of dead or potentially dead
tupleswhose space has not been marked as available for reuse by VACUUM or whose space has been returned to the OS with
VACUUMFULL.  For more information see: 


https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.postgresql.org%2Fdocs%2F12%2Froutine-vacuuming.html&data=01%7C01%7Cjralph%40affinitysolutions.com%7Cf3d88565688a48b9aa2908d7b09bd2b5%7Cfbf1a257f1104fc19456b0ca3038c6f0%7C1&sdata=sjxZiW0rGprUUjZUESwMH4ZHpMjYZ%2BB65t1mbqS0ah4%3D&reserved=0

I would think it would not matter if the files where copied or linked if the space was being held open as result of
regularVACUUM. 

Statistics are just that statistics collected about the distribution of values in the table. For more information see:


https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.postgresql.org%2Fdocs%2F12%2Fmonitoring-stats.html&data=01%7C01%7Cjralph%40affinitysolutions.com%7Cf3d88565688a48b9aa2908d7b09bd2b5%7Cfbf1a257f1104fc19456b0ca3038c6f0%7C1&sdata=lJmE9TwHc%2FKbcjJLWOMWlO%2FDGrNSv6SovQoRoJrKbyA%3D&reserved=0


https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.postgresql.org%2Fdocs%2F12%2Fplanner-stats-details.html&data=01%7C01%7Cjralph%40affinitysolutions.com%7Cf3d88565688a48b9aa2908d7b09bd2b5%7Cfbf1a257f1104fc19456b0ca3038c6f0%7C1&sdata=mYiAy2XOBfe%2FKSec3OK%2FjniVga9jNNFkjyKDPJBzjVE%3D&reserved=0

They are collected as part of the autovacuum process or by running ANALYZE by itself or with VACUUM.  FYI, pg_upgrade
doesnot automatically update the statistics, it just writes a script that you can then run manually to do that: 


https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.postgresql.org%2Fdocs%2F12%2Fpgupgrade.html&data=01%7C01%7Cjralph%40affinitysolutions.com%7Cf3d88565688a48b9aa2908d7b09bd2b5%7Cfbf1a257f1104fc19456b0ca3038c6f0%7C1&sdata=ZDGMLC6QYa%2FCDDc0p7PDukm9Hwr8NYiKBC7DUQ6tI3o%3D&reserved=0

14.Statistics

Because optimizer statistics are not transferred by pg_upgrade, you will be instructed to run a command to regenerate
thatinformation at the end of the upgrade. You might need to set connection parameters to match your new cluster. 

>
> Jason Ralph


--
Adrian Klaver
adrian.klaver@aklaver.com
This message contains confidential information and is intended only for the individual named. If you are not the named
addresseeyou should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if
youhave received this e-mail by mistake and delete this e-mail from your system. E-mail transmission cannot be
guaranteedto be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or
incomplete,or contain viruses. The sender therefore does not accept liability for any errors or omissions in the
contentsof this message, which arise as a result of e-mail transmission. If verification is required please request a
hard-copyversion. 



pgsql-general by date:

Previous
From: Michael Lewis
Date:
Subject: Re: pg_upgrade —link does it remove table bloat
Next
From: Adrian Klaver
Date:
Subject: Re: pg_upgrade —link does it remove table bloat