Re: Switching Primary Keys to BigInt - Mailing list pgsql-general

From Mohamed Wael Khobalatte
Subject Re: Switching Primary Keys to BigInt
Msg-id CABZeWdy+9uKiW9DiC4tk=FyO8YpJX-+yOsfFvdWJRB9Egre_2w@mail.gmail.com
In response to Re: Switching Primary Keys to BigInt  (Mohamed Wael Khobalatte <mkhobalatte@grubhub.com>)
List pgsql-general
On Tue, Jul 21, 2020 at 11:30 AM Mohamed Wael Khobalatte <mkhobalatte@grubhub.com> wrote:
Hi all,

We are running 9.6, and we are planning to move some primary keys from int to bigint because we are approaching the type limit. We understand this requires some downtime, but we want to know if there are things we can do to limit it. 

Here are our steps, with questions at the end. 

ALTER TABLE some_table ADD COLUMN new_id bigint; 
/* in batches, we update all the rows to new_id = id */
CREATE UNIQUE INDEX CONCURRENTLY some_table_pkey_new ON some_table(new_id); 
/* take the apps down */
BEGIN;
LOCK TABLE some_table; /* ACCESS EXCLUSIVE by default: blocks all reads and writes */
UPDATE some_table SET new_id = id WHERE new_id IS NULL; /* backfill any stragglers */
ALTER SEQUENCE some_table_id_seq OWNED BY some_table.new_id; /* reattach the sequence to the new column */
ALTER TABLE some_table ALTER COLUMN new_id SET DEFAULT nextval('some_table_id_seq'::regclass);
ALTER TABLE some_table DROP CONSTRAINT some_table_pkey;
/* promote the pre-built unique index to primary key */
ALTER TABLE some_table ADD CONSTRAINT some_table_pkey PRIMARY KEY USING INDEX some_table_pkey_new;
ALTER TABLE some_table DROP COLUMN id;
ALTER TABLE some_table RENAME COLUMN new_id TO id;
COMMIT;
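
A minimal sketch of one way to drive the batched backfill mentioned above (the id-range driver and the 10,000-row step are illustrative assumptions, not from the original; each statement runs as its own transaction so row locks stay short and vacuum can keep up):

/* repeat from an external script, advancing the id range each time */
UPDATE some_table SET new_id = id
WHERE id > 0 AND id <= 10000 AND new_id IS NULL;
/* ...next batch: id > 10000 AND id <= 20000, and so on up to max(id) */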

We are concerned with this step:

> ALTER TABLE some_table ADD CONSTRAINT some_table_pkey PRIMARY KEY USING INDEX some_table_pkey_new;

which requires a full-table scan to verify that new_id contains no NULLs. Is there a way to avoid that? Would a NOT NULL constraint on new_id, created as NOT VALID first and validated later, help us? I tried it on a table with about 50 million records and saw the time spent on the ALTER drop from 19 seconds to 8 seconds, which is inconclusive (both runs after restarts, for a cold cache). Is there another way to tell? Or does PG simply have to do a sequential scan?
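
For concreteness, the NOT VALID variant in question would look roughly like this (the constraint name is illustrative):

ALTER TABLE some_table ADD CONSTRAINT some_table_new_id_not_null CHECK (new_id IS NOT NULL) NOT VALID;
ALTER TABLE some_table VALIDATE CONSTRAINT some_table_new_id_not_null; /* scans, but only takes SHARE UPDATE EXCLUSIVE, so writes continue */

As far as I can tell this does not let 9.6 skip the scan when the primary key is added; the optimization that lets a validated CHECK (col IS NOT NULL) stand in for the scan on SET NOT NULL only arrived in PostgreSQL 12.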

If the constraint idea works, we would probably need to add a trigger to update new_id, but that's TBD. 
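
Such a trigger might look like the following minimal sketch (function and trigger names are illustrative); it keeps new_id populated for new writes so the final catch-up UPDATE stays small:

CREATE FUNCTION some_table_sync_new_id() RETURNS trigger AS $$
BEGIN
  NEW.new_id := NEW.id; /* mirror the old key into the new column */
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER some_table_sync_new_id
BEFORE INSERT OR UPDATE ON some_table
FOR EACH ROW EXECUTE PROCEDURE some_table_sync_new_id();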

The process I outlined above worked beautifully. Downtime was exactly what I expected, i.e. equal to the time of a sequential scan of the table in question (almost down to the second). I am writing this in case someone out there wants to adopt a similar mechanism.

Thank you all for your valuable input.

On Wed, Jul 22, 2020 at 4:52 PM Mohamed Wael Khobalatte <mkhobalatte@grubhub.com> wrote:


On Wed, Jul 22, 2020 at 4:45 PM Ireneusz Pluta/wp.pl <ipluta@wp.pl> wrote:
On 2020-07-21 at 17:30, Mohamed Wael Khobalatte wrote:
> we are planning to move some primary keys from int to bigint because we are approaching the type limit
If that does not break your business logic, you might arrange to use the negative half of the ::int
value range. Ugly, but it might at least buy you some time before you find the definitive and elegant
solution, if you are under pressure. I do not recommend this, but it is what once saved my life
(or at least one night), after I realized that my PK had already reached the limit :-).

Very clever. I think we are set with the current approach. The issue was more how much downtime we would incur, not how fast we are approaching the limit (which is also a real issue, but not the concern of this thread).
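
For completeness, the negative-half trick would amount to something like this untested sketch (it assumes the default serial sequence and that nothing in the application chokes on negative ids):

ALTER SEQUENCE some_table_id_seq
  INCREMENT BY -1
  MINVALUE -2147483648
  MAXVALUE 2147483647
  RESTART WITH -1;
/* new rows now count down from -1 toward the int4 minimum */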
