Thread: [GENERAL] Speed of conversion from int to bigint
Hi,
(Postgres 9.5 and 9.6)
We have a table of about 650million rows. It's a partitioned table, with two "child" tables. We want to change its primary key type from int to bigint while retaining the current values.
We're using this:
ALTER TABLE dta.my_table ALTER column table_id TYPE bigint;
But it's taking a very long time, and locking the database. We're going to need to do this in production as well, so a long-term table-lock isn't workable.
Is there anything we can do to speed things up? How long is this likely to take?
Thanks,
Jonathan
(Postgres 9.5 and 9.6)
We have a table of about 650million rows. It's a partitioned table, with two "child" tables. We want to change its primary key type from int to bigint while retaining the current values.
We're using this:
ALTER TABLE dta.my_table ALTER column table_id TYPE bigint;
But it's taking a very long time, and locking the database. We're going to need to do this in production as well, so a long-term table-lock isn't workable.
Is there anything we can do to speed things up? How long is this likely to take?
Thanks,
Jonathan
On 09/27/2017 10:08 AM, Jonathan Moules wrote: > Hi, > (Postgres 9.5 and 9.6) > We have a table of about 650million rows. It's a partitioned table, with > two "child" tables. We want to change its primary key type from int to > bigint while retaining the current values. > > We're using this: > > ALTER TABLE dta.my_table ALTER column table_id TYPE bigint; > > But it's taking a very long time, and locking the database. We're going > to need to do this in production as well, so a long-term table-lock > isn't workable. It's taking very long time, because it does a full-table rewrite while holding AccessExclusiveLock on it. Which is the strongest lock mode. > Is there anything we can do to speed things up? How long is this likely > to take? > What you can do, is roughly this: --- ALTER TABLE my_table ADD COLUMN new_table_id TYPE bigint; -- do this in batches, so that a single transaction does not update -- all the rows UPDATE my_table SET new_table_id = table_id; -- build unique index on the new column CREATE INDEX CONCURRENTLY my_table_2_pkey ON my_table (new_table_id); BEGIN; -- switch the PK constraint ALTER TABLE my_table DROP CONSTRAINT my_table_pkey; ALTER TABLE my_table ADD CONSTRAINT my_table_pkey PRIMARY KEY (new_table_id) USING my_table_2_pkey; -- drop, rename the columns ALTER TABLE my_table DROP COLUMN table_id; ALTER TABLE my_table ALTER COLUMN new_table_id RENAME TO table_id; COMMIT; You may need to tweak this to handle the inheritance tree. -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
On Wed, 27 Sep 2017 09:08:25 +0100, Jonathan Moules <jonathan-lists@lightpear.com> wrote: >Hi, >(Postgres 9.5 and 9.6) >We have a table of about 650million rows. It's a partitioned table, >with two "child" tables. We want to change its primary key type >from int to bigint while retaining the current values. > >We're using this: > >ALTER TABLE dta.my_table ALTER column table_id TYPE bigint; > >But it's taking a very long time, and locking the database. We're >going to need to do this in production as well, so a long-term >table-lock isn't workable. >Is there anything we can do to speed things up? Better to create a new table having the right structure and then copy the data from the original table. >How long is this likely to take? Coping 650M rows will be [slightly] faster than altering the structure of the original table, but it still won't be quick. If you need to keep the original in service while copying, one trick is to add a boolean "copied" column (default false) to the original table, That will be very quick [no constraints to check]. Then, in a loop, do something like: *** warning - pseudo code ***---- while not done with batch as (update <source> set copied = true where not copied limit 10000 returning<columns_to_copy> ) insert into <target> select * from batch if affected rows < 10000 begin transaction serializable alter table <source> rename to <backup> alter table <target>rename to <source> commit---- Rinse and repeat until all the rows have been transferred to the new table. When the insert row count drops below the batch size [assuming no errors have occurred], you know you have copied the last batch. Then you quickly rename the tables to put the new table into service. You need to do a final check for and copy of any updates to the original that might have snuck in while processing the last batch. And lastly you can drop the source table. It won't be fast, but it also won't paralyze your database while its working. >Thanks, >Jonathan Hope this helps. George -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Tomas's suggestion definitely is the better if you're altering the type of a single column. If you need to make more extensive changes to the table structure, copying usually is the better way to go. George -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Speed of conversion from int to bigint
From
"Jehan-Guillaume (ioguix) de Rorthais"
Date:
On Wed, 27 Sep 2017 11:31:54 +0200 Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > On 09/27/2017 10:08 AM, Jonathan Moules wrote: > > Hi, > > (Postgres 9.5 and 9.6) > > We have a table of about 650million rows. It's a partitioned table, with > > two "child" tables. We want to change its primary key type from int to > > bigint while retaining the current values. > > > > We're using this: > > > > ALTER TABLE dta.my_table ALTER column table_id TYPE bigint; > > > > But it's taking a very long time, and locking the database. We're going > > to need to do this in production as well, so a long-term table-lock > > isn't workable. > > It's taking very long time, because it does a full-table rewrite while > holding AccessExclusiveLock on it. Which is the strongest lock mode. > > > Is there anything we can do to speed things up? How long is this likely > > to take? > > > > What you can do, is roughly this: > > --- > ALTER TABLE my_table ADD COLUMN new_table_id TYPE bigint; > > -- do this in batches, so that a single transaction does not update > -- all the rows > UPDATE my_table SET new_table_id = table_id; After or before each UPDATE in your batch, make sure to run a VACUUM on your table, to keep bloat as low as possible. Without vacuum, you will probably end up with a table up to twice bigger than before the maintenance...And you'll have to handle this in another maintenance period. > -- build unique index on the new column > CREATE INDEX CONCURRENTLY my_table_2_pkey ON my_table (new_table_id); It misses an "UNIQUE" keyword : CREATE UNIQUE INDEX CONCURRENTLY my_table_2_pkey ON my_table (new_table_id); Regards, -- Jehan-Guillaume de Rorthais Dalibo -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general