Thread: [GENERAL] Speed of conversion from int to bigint

[GENERAL] Speed of conversion from int to bigint

From
Jonathan Moules
Date:
Hi,
(Postgres 9.5 and 9.6)
We have a table of about 650million rows. It's a partitioned table, with two "child" tables. We want to change its primary key type from int to bigint while retaining the current values.

We're using this:

ALTER TABLE dta.my_table ALTER column table_id TYPE bigint;

But it's taking a very long time, and locking the database. We're going to need to do this in production as well, so a long-term table-lock isn't workable.
Is there anything we can do to speed things up? How long is this likely to take?

Thanks,
Jonathan

Re: [GENERAL] Speed of conversion from int to bigint

From
Tomas Vondra
Date:

On 09/27/2017 10:08 AM, Jonathan Moules wrote:
> Hi,
> (Postgres 9.5 and 9.6)
> We have a table of about 650million rows. It's a partitioned table, with
> two "child" tables. We want to change its primary key type from int to
> bigint while retaining the current values.
> 
> We're using this:
> 
> ALTER TABLE dta.my_table ALTER column table_id TYPE bigint;
> 
> But it's taking a very long time, and locking the database. We're going
> to need to do this in production as well, so a long-term table-lock
> isn't workable.

It's taking very long time, because it does a full-table rewrite while
holding AccessExclusiveLock on it. Which is the strongest lock mode.

> Is there anything we can do to speed things up? How long is this likely
> to take?
> 

What you can do, is roughly this:

---
ALTER TABLE my_table ADD COLUMN new_table_id TYPE bigint;

-- do this in batches, so that a single transaction does not update
-- all the rows
UPDATE my_table SET new_table_id = table_id;

-- build unique index on the new column
CREATE INDEX CONCURRENTLY my_table_2_pkey ON my_table (new_table_id);

BEGIN;
-- switch the PK constraint
ALTER TABLE my_table DROP CONSTRAINT my_table_pkey;
ALTER TABLE my_table ADD CONSTRAINT my_table_pkey PRIMARY KEY
(new_table_id) USING my_table_2_pkey;

-- drop, rename the columns
ALTER TABLE my_table DROP COLUMN table_id;
ALTER TABLE my_table ALTER COLUMN new_table_id RENAME TO table_id;
COMMIT;

You may need to tweak this to handle the inheritance tree.

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Speed of conversion from int to bigint

From
George Neuner
Date:
On Wed, 27 Sep 2017 09:08:25 +0100, Jonathan Moules
<jonathan-lists@lightpear.com> wrote:

>Hi,
>(Postgres 9.5 and 9.6)
>We have a table of about 650million rows. It's a partitioned table,
>with two "child" tables. We want to change its primary key type
>from int to bigint while retaining the current values.
>
>We're using this:
>
>ALTER TABLE dta.my_table ALTER column table_id TYPE bigint;
>
>But it's taking a very long time, and locking the database. We're
>going to need to do this in production as well, so a long-term 
>table-lock isn't workable.

>Is there anything we can do to speed things up? 

Better to create a new table having the right structure and then copy
the data from the original table.

>How long is this likely to take?

Coping 650M rows will be [slightly] faster than altering the structure
of the original table, but it still won't be quick.  

If you need to keep the original in service while copying, one trick
is to add a boolean "copied" column (default false) to the original
table, That will be very quick [no constraints to check].  

Then, in a loop, do something like:
  *** warning - pseudo code ***----
while not done with   batch as    (update <source>      set copied = true      where not copied      limit 10000
returning<columns_to_copy> ) insert into <target>   select *      from batch
 
 if affected rows < 10000   begin transaction serializable     alter table <source> rename to <backup>     alter table
<target>rename to <source>   commit----
 

Rinse and repeat until all the rows have been transferred to the new
table.  When the insert row count drops below the batch size [assuming
no errors have occurred], you know you have copied the last batch.
Then you quickly rename the tables to put the new table into service.

You need to do a final check for and copy of any updates to the
original that might have snuck in while processing the last batch. And
lastly you can drop the source table.

It won't be fast, but it also won't paralyze your database while its
working.


>Thanks,
>Jonathan

Hope this helps.
George



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Speed of conversion from int to bigint

From
George Neuner
Date:
Tomas's suggestion definitely is the better if you're altering the
type of a single column.  If you need to make more extensive changes
to the table structure, copying usually is the better way to go.

George



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Speed of conversion from int to bigint

From
"Jehan-Guillaume (ioguix) de Rorthais"
Date:
On Wed, 27 Sep 2017 11:31:54 +0200
Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

> On 09/27/2017 10:08 AM, Jonathan Moules wrote:
> > Hi,
> > (Postgres 9.5 and 9.6)
> > We have a table of about 650million rows. It's a partitioned table, with
> > two "child" tables. We want to change its primary key type from int to
> > bigint while retaining the current values.
> > 
> > We're using this:
> > 
> > ALTER TABLE dta.my_table ALTER column table_id TYPE bigint;
> > 
> > But it's taking a very long time, and locking the database. We're going
> > to need to do this in production as well, so a long-term table-lock
> > isn't workable.  
> 
> It's taking very long time, because it does a full-table rewrite while
> holding AccessExclusiveLock on it. Which is the strongest lock mode.
> 
> > Is there anything we can do to speed things up? How long is this likely
> > to take?
> >   
> 
> What you can do, is roughly this:
> 
> ---
> ALTER TABLE my_table ADD COLUMN new_table_id TYPE bigint;
> 
> -- do this in batches, so that a single transaction does not update
> -- all the rows
> UPDATE my_table SET new_table_id = table_id;

After or before each UPDATE in your batch, make sure to run a VACUUM on your
table, to keep bloat as low as possible.

Without vacuum, you will probably end up with a table up to twice bigger than
before the maintenance...And you'll have to handle this in another maintenance
period.

> -- build unique index on the new column
> CREATE INDEX CONCURRENTLY my_table_2_pkey ON my_table (new_table_id);

It misses an "UNIQUE" keyword :
 CREATE UNIQUE INDEX CONCURRENTLY my_table_2_pkey ON my_table (new_table_id);


Regards,
-- 
Jehan-Guillaume de Rorthais
Dalibo


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general