Thread: Effects of dropping a large table
Hi all,
I'm hoping to ensure I understand the implications of dropping a large table and the space being reclaimed by the database and/or OS. We're using pg v14.
This table is quite large with a primary key and one additional index—all together these are on the order of 1TB. The primary key and index take-up quite a bit more space than the table itself. Our hope is to discontinue use of this table and then eventually drop it. However, the database is under constant load and we'd like to avoid (or at least anticipate) downtime or degraded performance. The database also replicates to a standby instance.
So in short, what can we expect if we drop this table? Will the strategy that pg takes to give that space back to the rest of the database and/or OS have significant effects on availability or performance? Finally, are there any other considerations that we should take into account? I appreciate your time and input, thanks!
--
Devin Ivy
On 7/18/23 11:58, Devin Ivy wrote:
You might consider deleting portions of the table in separate (consecutive) batches (maybe 5% per delete). And then truncate table is not logged so that might be an alternative.Hi all,I'm hoping to ensure I understand the implications of dropping a large table and the space being reclaimed by the database and/or OS. We're using pg v14.This table is quite large with a primary key and one additional index—all together these are on the order of 1TB. The primary key and index take-up quite a bit more space than the table itself. Our hope is to discontinue use of this table and then eventually drop it. However, the database is under constant load and we'd like to avoid (or at least anticipate) downtime or degraded performance. The database also replicates to a standby instance.So in short, what can we expect if we drop this table? Will the strategy that pg takes to give that space back to the rest of the database and/or OS have significant effects on availability or performance? Finally, are there any other considerations that we should take into account? I appreciate your time and input, thanks!--Devin Ivy
On Tue, 2023-07-18 at 13:58 -0400, Devin Ivy wrote: > I'm hoping to ensure I understand the implications of dropping a large table and > the space being reclaimed by the database and/or OS. We're using pg v14. > > This table is quite large with a primary key and one additional index—all > together these are on the order of 1TB. The primary key and index take-up > quite a bit more space than the table itself. Our hope is to discontinue > use of this table and then eventually drop it. However, the database is > under constant load and we'd like to avoid (or at least anticipate) downtime > or degraded performance. The database also replicates to a standby instance. > > So in short, what can we expect if we drop this table? Will the strategy > that pg takes to give that space back to the rest of the database and/or > OS have significant effects on availability or performance? Finally, are > there any other considerations that we should take into account? I appreciate > your time and input, thanks! If you drop a table, the underlying files are deleted, and the disk space becomes available. Usually that is a cheap operation, but that of course depends on the file system you are using. On a copy-on-write file system, the space won't be freed right away. The only difficulty that can arise is if you have long running transactions that involve the table. To prevent a hanging DROP TABLE from blocking other transactions for a long time, you can SET lock_timeout = '1s'; DROP TABLE ...; Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On Wed, 19 Jul 2023 at 07:41, Rob Sargent <robjsargent@gmail.com> wrote: > You might consider deleting portions of the table in separate (consecutive) batches (maybe 5% per delete). And then truncatetable is not logged so that might be an alternative. Can you explain why this would be a useful thing to do? It sounds to me like it would just create a load of needless WAL from the deletes and the vacuum that cleans up the dead rows each of which is more likely to cause lag problems on the replica servers, which the OP is trying to avoid. David
On 7/19/23 17:15, David Rowley wrote:
No, you're right. I was remembering problems with _deleting_ essentially all of a large table (with limited resources). The drop might not have the same problem. But aren't they logged/transactional and then in the WALs anyway.On Wed, 19 Jul 2023 at 07:41, Rob Sargent <robjsargent@gmail.com> wrote:You might consider deleting portions of the table in separate (consecutive) batches (maybe 5% per delete). And then truncate table is not logged so that might be an alternative.Can you explain why this would be a useful thing to do? It sounds to me like it would just create a load of needless WAL from the deletes and the vacuum that cleans up the dead rows each of which is more likely to cause lag problems on the replica servers, which the OP is trying to avoid. David
On 2023-07-22 16:37:39 -0400, Gus Spier wrote: > Isn’t this a perfect opportunity to use the TRUNCATE command to > quickly remove the data? And follow up by deleting the now empty > tables? What's the advantage of first truncating and then deleting a table over just deleting it? hp -- _ | Peter J. Holzer | Story must make more sense than reality. |_|_) | | | | | hjp@hjp.at | -- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!"
Attachment
Ah! Truncating a table does not entail all of WAL processes. From the documentation, "
TRUNCATE
quickly removes all rows from a set of tables. It has the same effect as an unqualified DELETE
on each table, but since it does not actually scan the tables it is faster. Furthermore, it reclaims disk space immediately, rather than requiring a subsequent VACUUM
operation. This is most useful on large tables." https://www.postgresql.org/docs/14/sql-truncate.htmlRegards,
Gus
On Sun, Jul 23, 2023 at 5:51 AM Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
On 2023-07-22 16:37:39 -0400, Gus Spier wrote:
> Isn’t this a perfect opportunity to use the TRUNCATE command to
> quickly remove the data? And follow up by deleting the now empty
> tables?
What's the advantage of first truncating and then deleting a table over
just deleting it?
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
On 2023-07-23 06:09:03 -0400, Gus Spier wrote: > Ah! Truncating a table does not entail all of WAL processes. From the > documentation, "TRUNCATE quickly removes all rows from a set of tables. It has > the same effect as an unqualified DELETE on each table, but since it does not > actually scan the tables it is faster. Furthermore, it reclaims disk space > immediately, rather than requiring a subsequent VACUUM operation. This is most > useful on large tables." https://www.postgresql.org/docs/14/sql-truncate.html I assumed that by "deleting the now empty table" you meant DROPing it. (Performing a «DELETE FROM t» just after a «TRUNCATE t» would obviously be pointless). So let me rephrase the question: What's the advantage of TRUNCATE t DROP t over just DROP t hp -- _ | Peter J. Holzer | Story must make more sense than reality. |_|_) | | | | | hjp@hjp.at | -- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!"
Attachment
On 7/23/23 05:27, Peter J. Holzer wrote: > On 2023-07-23 06:09:03 -0400, Gus Spier wrote: >> Ah! Truncating a table does not entail all of WAL processes. From the >> documentation, "TRUNCATE quickly removes all rows from a set of tables. It has >> the same effect as an unqualified DELETE on each table, but since it does not >> actually scan the tables it is faster. Furthermore, it reclaims disk space >> immediately, rather than requiring a subsequent VACUUM operation. This is most >> useful on large tables." https://www.postgresql.org/docs/14/sql-truncate.html > I assumed that by "deleting the now empty table" you meant DROPing it. > (Performing a «DELETE FROM t» just after a «TRUNCATE t» would obviously > be pointless). > > So let me rephrase the question: > > What's the advantage of > > TRUNCATE t > DROP t > > over just > > DROP t Catalog or serialization locking? (I don't know; just asking.) -- Born in Arizona, moved to Babylonia.