Thread: Determine potential change in table size after a column dropped?
I have a large large large table with many many many rows, and it's a certain size in pg_relation_size -- there's a timestamp with tz column on this table that's mostly kind of useless, and I want to figure out how much space it would free if we just dropped it. Can I easily do this?
On Sat, Jan 22, 2022, 12:47 PM Wells Oliver <wells.oliver@gmail.com> wrote:
> [...] there's a timestamp with tz column on this table that's mostly kind of useless, and I want to figure out how much space it would free if we just dropped it. Can I easily do this?

The DROP COLUMN form does not physically remove the column, but simply makes it invisible to SQL operations. Subsequent insert and update operations in the table will store a null value for the column. Thus, dropping a column is quick but it will not immediately reduce the on-disk size of your table, as the space occupied by the dropped column is not reclaimed. The space will be reclaimed over time as existing rows are updated.

To force immediate reclamation of space occupied by a dropped column, you can execute one of the forms of ALTER TABLE that performs a rewrite of the whole table. This results in reconstructing each row with the dropped column replaced by a null value.
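For the original "how much space would it free" question, a back-of-the-envelope estimate is possible without touching the table. The sketch below is only that: it assumes each non-null timestamptz value occupies 8 bytes and that null values take no data space, and it ignores alignment padding and page-level effects, so the real figure after a rewrite may differ. The function name and parameters are hypothetical; the row count could come from pg_class.reltuples and the null fraction from pg_stats.null_frac, both of which are themselves estimates.

```python
TIMESTAMPTZ_BYTES = 8  # on-disk width of one non-null timestamptz value


def estimate_freed_bytes(row_count: int,
                         null_fraction: float = 0.0,
                         column_bytes: int = TIMESTAMPTZ_BYTES) -> int:
    """Rough estimate of the bytes a full table rewrite could reclaim.

    Ignores alignment padding, so treat the result as an approximation.
    """
    if row_count < 0:
        raise ValueError("row_count must not be negative")
    if not 0.0 <= null_fraction <= 1.0:
        raise ValueError("null_fraction must be between 0 and 1")
    # Only rows that actually store a value contribute to the saving.
    non_null_rows = round(row_count * (1.0 - null_fraction))
    return non_null_rows * column_bytes


if __name__ == "__main__":
    # Hypothetical table: one billion rows, a quarter of them null in this column.
    freed = estimate_freed_bytes(1_000_000_000, null_fraction=0.25)
    print(f"roughly {freed / 1024 ** 3:.1f} GiB")
```

Comparing this number against pg_relation_size for the table gives a feel for whether the rewrite is worth the effort.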
On 1/22/22 1:43 AM, Vijaykumar Jain wrote:
> [...] To force immediate reclamation of space occupied by a dropped column, you can execute one of the forms of ALTER TABLE that performs a rewrite of the whole table. This results in reconstructing each row with the dropped column replaced by a null value.

What about VACUUM FULL?

-- 
Angular momentum makes the world go 'round.
I need only drop the column and VACUUM FULL the table, and not the entire DB, right?
On Sat, 2022-01-22 at 09:08 -0800, Wells Oliver wrote:
> I need only drop the column and VACUUM FULL the table, and not the entire DB, right?

Note that VACUUM (FULL) will *not* physically get rid of a dropped column,
as it just copies the complete rows to a new table.

You would need something like:

CREATE TABLE newtab (LIKE oldtab);
INSERT INTO newtab SELECT * FROM oldtab;

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com
Laurenz Albe wrote on 24.01.2022 at 09:28:
> On Sat, 2022-01-22 at 09:08 -0800, Wells Oliver wrote:
>> I need only drop the column and VACUUM FULL the table, and not the entire DB, right?
>
> Note that VACUUM (FULL) will *not* physically get rid of a dropped column,
> as it just copies the complete rows to a new table.

I always wondered why that is the case. If the table is rewritten entirely, wouldn't that also be an option to really get rid of dropped columns?

Is there a technical reason, or just a case of "no one cared enough to change it"?

Regards
Thomas
On Mon, 2022-01-24 at 08:08 -0800, Wells Oliver wrote:
> > You would need something like:
> >
> > CREATE TABLE newtab (LIKE oldtab);
> > INSERT INTO newtab SELECT * FROM oldtab;
>
> So, there's really no way to reclaim space from a dropped column other than
> entirely creating a new table?

Correct, as far as I know.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com
Hi All Supermen Experts,

I'm new to pgsql and have a similar problem with a Timescale pgDB. A DB table stores raw session data received through an IoT network from many remote machines. The data format is the same for all machines, but session durations vary from 1 minute to 1 hour or so. Each machine may be activated once a day or a few times a day, at random.

My questions are:
1. How to set up a watchdog to detect that new data has been added to the DB, and
2. How to pick up the newly completed session data since the last pick-up and put it into a buffer table dedicated to new data for further ETL processing?

If you have some scripts in pgSQL, Python or C, it will be greatly appreciated!

Thank you.

Best regards,
Ji
On 01.02.22 at 14:46, Jiankang Ji wrote:
> [...] My question is:
> 1. How to setup a watch-dog to detect new data has been added into the DB, and
> 2. How to pick-up the newly completed sessions data since last pick-up and put it into a buffer table dedicated to new data for further ETL processing?

In order to get notified should new rows arrive (or current ones updated or deleted), you can install a trigger which fires a NOTIFY command on a name (channel).
All other sessions which have issued a LISTEN on the same name (channel) will receive a notification.
Unfortunately, not all languages and drivers support this.
Recently, I updated the code for pg_listen in the script language Tcl. It's committed, but no new version released yet.
-- Holger Jakobs, Bergisch Gladbach, Tel. +49-178-9759012
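Since Python scripts were asked for, here is a minimal sketch of the trigger side of the NOTIFY approach described above. It only renders the DDL as text; running it against a database is left to whatever driver is in use. The table name raw_sessions, the channel name new_sessions, and all function names are hypothetical, the quoting helpers assume standard_conforming_strings is on, and EXECUTE FUNCTION assumes PostgreSQL 11 or later (older releases spell it EXECUTE PROCEDURE).

```python
def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Single-quote an SQL string literal, escaping embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def notify_trigger_sql(table: str, channel: str) -> str:
    """Render DDL for a statement-level trigger that notifies a channel.

    Assumes an unqualified table name; schema handling is left out.
    """
    func = quote_ident(f"{table}_notify")
    trig = quote_ident(f"{table}_notify_trg")
    return f"""\
CREATE OR REPLACE FUNCTION {func}() RETURNS trigger AS $$
BEGIN
    -- The payload is just the table name; listeners look up the new rows themselves.
    PERFORM pg_notify({quote_literal(channel)}, TG_TABLE_NAME::text);
    RETURN NULL;  -- the return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER {trig}
    AFTER INSERT OR UPDATE OR DELETE ON {quote_ident(table)}
    FOR EACH STATEMENT EXECUTE FUNCTION {func}();
"""


if __name__ == "__main__":
    print(notify_trigger_sql("raw_sessions", "new_sessions"))
```

A session that has issued LISTEN new_sessions would then receive one notification per committed statement that modifies the table. A statement-level trigger is used here so that a bulk insert produces a single notification rather than one per row; whether that suits the ETL pick-up is a design choice.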