Thread: Switching Primary Keys to BigInt
Hi all,
We are running 9.6, and we are planning to move some primary keys from int to bigint because we are approaching the type limit. We understand this requires some downtime, but we want to know if there are things we can do to limit it.
Here are our steps, with questions at the end.
ALTER TABLE some_table ADD COLUMN new_id bigint;
/* in batches, we update all the rows to new_id = id -- see the backfill sketch below */
CREATE UNIQUE INDEX CONCURRENTLY some_table_pkey_new ON some_table(new_id);
/* take the apps down */
BEGIN;
LOCK TABLE some_table;
UPDATE some_table SET new_id = id WHERE new_id IS NULL;
ALTER SEQUENCE some_table_id_seq OWNED BY some_table.new_id;
ALTER TABLE some_table ALTER COLUMN new_id SET DEFAULT nextval('some_table_id_seq'::regclass);
ALTER TABLE some_table DROP CONSTRAINT some_table_pkey;
ALTER TABLE some_table ADD CONSTRAINT some_table_pkey PRIMARY KEY USING INDEX some_table_pkey_new;
ALTER TABLE some_table DROP COLUMN id;
ALTER TABLE some_table RENAME COLUMN new_id to id;
COMMIT;
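For reference, a hypothetical shape for the batched backfill mentioned in the comment above (the batch size is arbitrary; repeat until the statement reports UPDATE 0):

-- run outside the downtime window, in a loop, committing between batches
UPDATE some_table
   SET new_id = id
 WHERE id IN (SELECT id
                FROM some_table
               WHERE new_id IS NULL
               ORDER BY id
               LIMIT 10000);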
We are concerned with this step:
> ALTER TABLE some_table ADD CONSTRAINT some_table_pkey PRIMARY KEY USING INDEX some_table_pkey_new;
which requires a table scan. Is there a way to avoid that? Would a not null constraint on new_id, created as invalid first and validated later, help us? I tried it on a table with about 50 million records and saw the time spent on the ALTER drop from 19 seconds to 8 seconds, which is inconclusive (both runs after restarts, for a cold cache). Is there another way to tell? Or does PG just have to do a sequential scan?
If the constraint idea works, we would probably need to add a trigger to update new_id, but that's TBD.
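A minimal sketch of the "create invalid, then validate" idea raised above (names are taken from the steps; NOT VALID is only available for CHECK and foreign key constraints, not for NOT NULL itself, so a CHECK constraint stands in):

-- cheap to add: only a brief lock, and new rows are checked immediately
ALTER TABLE some_table
    ADD CONSTRAINT some_table_new_id_not_null
    CHECK (new_id IS NOT NULL) NOT VALID;

-- scans the table to verify existing rows, but does not block normal reads and writes
ALTER TABLE some_table VALIDATE CONSTRAINT some_table_new_id_not_null;

-- note: on 9.6 the later ADD CONSTRAINT ... PRIMARY KEY still performs its own scan to prove
-- the column is NOT NULL; the validated CHECK does not, as far as I can tell, let it skip that.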
On 7/21/20 8:30 AM, Mohamed Wael Khobalatte wrote:
> We are running 9.6, and we are planning to move some primary keys from
> int to bigint because we are approaching the type limit.

Could you not simplify to something like this:

test_(aklaver)5432> create table change_seq(id serial PRIMARY KEY);
CREATE TABLE
test_(aklaver)5432> \d change_seq
                            Table "public.change_seq"
 Column |  Type   | Collation | Nullable |                Default
--------+---------+-----------+----------+----------------------------------------
 id     | integer |           | not null | nextval('change_seq_id_seq'::regclass)
Indexes:
    "change_seq_pkey" PRIMARY KEY, btree (id)

test_(aklaver)5432> alter table change_seq alter COLUMN id set data type bigint;
ALTER TABLE
test_(aklaver)5432> \d change_seq
                            Table "public.change_seq"
 Column |  Type  | Collation | Nullable |                Default
--------+--------+-----------+----------+----------------------------------------
 id     | bigint |           | not null | nextval('change_seq_id_seq'::regclass)
Indexes:
    "change_seq_pkey" PRIMARY KEY, btree (id)

test_(aklaver)5432> alter sequence change_seq_id_seq as bigint;
ALTER SEQUENCE

--
Adrian Klaver
adrian.klaver@aklaver.com
On 7/21/20 11:17 AM, Adrian Klaver wrote:
> test_(aklaver)5432> alter table change_seq alter COLUMN id set data type bigint;
> ALTER TABLE

Forgot sequences are bigint by default. It would not hurt to check pg_sequence just to make sure they are that. In that case the below is not needed.

> test_(aklaver)5432> alter sequence change_seq_id_seq as bigint;
> ALTER SEQUENCE

--
Adrian Klaver
adrian.klaver@aklaver.com
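A quick way to do that check (sequence name from the example above; note that the pg_sequence catalog only exists from version 10 on, so on 9.6 you read the sequence relation directly):

-- PostgreSQL 9.6: a sequence is a selectable one-row relation
SELECT min_value, max_value FROM change_seq_id_seq;

-- PostgreSQL 10 and later: the same information is in the pg_sequences view
-- SELECT min_value, max_value FROM pg_sequences WHERE sequencename = 'change_seq_id_seq';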
Curious- what requires that the unique index be declared a primary key? What advantage does that give you? Just ensuring it isn't null?
Side note- EOL for 9.6 is coming next year so just a plug for upgrading when possible, perhaps utilizing pglogical to get to v11 or 12.
> test_(aklaver)5432> alter table change_seq alter COLUMN id set data type bigint;
> ALTER TABLE
This is significant downtime, since it locks exclusively, no? We want to avoid that.
> Side note- EOL for 9.6 is coming next year so just a plug for upgrading when possible, perhaps utilizing pglogical to get to v11 or 12.
Yep, we are painfully aware. The id growth will beat us to it, so we need to deal with that first.
On 7/21/20 2:18 PM, Mohamed Wael Khobalatte wrote:
> This is significant downtime, since it locks exclusively, no? We want to
> avoid that.

Yeah, I thought the int --> bigint would not do a table rewrite. Testing showed otherwise. Forget that idea.

--
Adrian Klaver
adrian.klaver@aklaver.com
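For what it's worth, one way to see the rewrite happen in a test like the change_seq example above is to watch the table's relfilenode, which changes whenever ALTER TABLE rewrites the heap:

SELECT relfilenode FROM pg_class WHERE relname = 'change_seq';
ALTER TABLE change_seq ALTER COLUMN id SET DATA TYPE bigint;
SELECT relfilenode FROM pg_class WHERE relname = 'change_seq';  -- a different value here means the table was rewritten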
> Yeah, I thought the int --> bigint would not do a table rewrite. Testing
> showed otherwise. Forget that idea.
Got it. Not sure what else we should consider. It seemed like the constraint might be possible, but we currently need a far bigger table to be able to tell for sure, since we can't EXPLAIN a DDL statement.
Mohamed Wael Khobalatte wrote:
> > alter table change_seq alter COLUMN id set data type bigint;
>
> This is significant downtime, since it locks exclusively, no? We want to
> avoid that.

Well, in the steps you mentioned upthread, the transaction starts by doing LOCK TABLE some_table, so it will hold an exclusive lock on it for the rest of the transaction.

If you can test how the ALTER TABLE... SET TYPE ... compares to your procedure in terms of downtime, that would be interesting. To me, it's not clear why the procedure in multiple steps would be better overall than a single ALTER TABLE.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite
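If anyone wants to run that comparison without touching the live table, a rough sketch on a scratch copy (the copy name is made up, and the copy has no indexes, so the real ALTER would also pay for rebuilding the primary key index):

\timing on
CREATE TABLE some_table_copy AS SELECT * FROM some_table;
ALTER TABLE some_table_copy ALTER COLUMN id SET DATA TYPE bigint;  -- full table rewrite
DROP TABLE some_table_copy;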
On Wed, Jul 22, 2020 at 9:27 AM Daniel Verite <daniel@manitou-mail.org> wrote:
> To me, it's not clear why the procedure in multiple steps would
> be better overall than a single ALTER TABLE.
We lock the table as a precaution, with the understanding that we are undergoing a "small" downtime to finish replacing the int id with the new bigint one. The only slow thing in my procedure is the sequential scan that the ADD CONSTRAINT does because the column is a primary key. A direct ALTER TABLE would be far slower, not to mention the space requirements?
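On the space point: a type-changing ALTER writes a complete new copy of the table (and rebuilds its indexes) before dropping the old one, so a rough check of the headroom that would need, using the table name from the example:

SELECT pg_size_pretty(pg_total_relation_size('some_table'));  -- heap + indexes + TOAST; expect to need roughly this much free space again during the rewrite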
Mohamed Wael Khobalatte wrote:
> We lock the table as a precaution, with the understanding that we are
> undergoing a "small" downtime to finish replacing the int id with the new
> bigint one.

Ah, sorry I overlooked that most row updates are done pre-transaction in a preliminary step:

  /* in batches, we update all the rows to new_id = id */

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite
On Wed, Jul 22, 2020 at 11:13 AM Daniel Verite <daniel@manitou-mail.org> wrote:
> Ah, sorry I overlooked that most row updates are done
> pre-transaction in a preliminary step
No worries. I suppose the answer to the original question, which is how to avoid a table scan when adding a primary key constraint to a newly backfilled column, is "there is no way"? Downtime might be at least as long as the table scan.
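A rough way to gauge that window ahead of time, on a replica or a restored copy, is to force a full read of the table and time it (cache state matters, as noted earlier; the SET lines are session-local and only there to keep the planner off the indexes):

SET enable_indexscan = off;
SET enable_indexonlyscan = off;
SET enable_bitmapscan = off;
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM some_table;  -- the sequential scan time approximates the validation scan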
On Wed, Jul 22, 2020 at 12:23 PM Mohamed Wael Khobalatte <mkhobalatte@grubhub.com> wrote:
> I suppose the answer to the original question [...] is "there is no way"? Downtime might be
> at least as long as the table scan.
One presumes you may be planning to use pglogical or another similar solution to upgrade to a new Postgres version soon, and would have a convenient time then to change the schema. I am curious: why not just stick with the single-column unique index and forgo the primary key constraint for now? If you are concerned about the possibility of a single null value being inserted, then you could add a not valid check constraint to enforce that for future rows.
You do you. Obviously, testing the primary key on a full replica of the data with similar hardware and configs will give you a pretty good idea of the time for that table scan and adding the constraint in real life. Given your email domain, I can guess why you would need to absolutely minimize downtime.
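A sketch of that alternative, reusing the NOT VALID check mechanics from the sketch upthread (the constraint name is hypothetical); nothing here scans the table while the apps are down:

-- keep some_table_pkey_new (built with CREATE UNIQUE INDEX CONCURRENTLY) as the de facto key;
-- no PRIMARY KEY constraint is added, so there is no validating scan during the downtime window.
ALTER TABLE some_table
    ADD CONSTRAINT some_table_new_id_not_null
    CHECK (new_id IS NOT NULL) NOT VALID;  -- enforced for new rows right away, existing rows unchecked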
> why not just stick with the single-column unique index and forgo the primary key constraint for now?
Believe it or not I pitched a similar idea internally. I'll explore it in a test run.
> Given your email domain, I can guess why you would need to absolutely minimize downtime.
Right. ;)
> Believe it or not I pitched a similar idea internally. I'll explore it in a test run.
By similar idea, I am referencing your suggestion of dropping the primary key constraint.
On 2020-07-21 at 17:30, Mohamed Wael Khobalatte wrote:
> we are planning to move some primary keys from int to bigint because we are approaching the type limit
If that does not break your business logic, you might arrange to use the negative half of the ::int
value range. Ugly, but it might at least buy you some time before you find the definitive and elegant
way, if you are under pressure. I do not recommend this, but it is what once saved my life
(or at least one night), after I realized that my PK had already reached the limit :-).
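For completeness, a hypothetical sketch of that stopgap (sequence name borrowed from the original post; the application has to tolerate negative ids, and it only buys time, the type still has to change eventually):

-- hand out the unused negative half of int, counting downward from -1
ALTER SEQUENCE some_table_id_seq
    INCREMENT BY -1
    MINVALUE -2147483648
    RESTART WITH -1;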
Very clever. I think we are set with the current approach. The issue was more how much downtime, not how fast we are approaching the limit (which is also a real issue but not of concern in this thread).
The process I outlined above worked beautifully. Downtime was exactly what I expected, i.e. equal to a sequential scan of the table in question (almost down to the second). I am writing this in case someone out there wants to adopt a similar mechanism.
Thank you all for your valuable inputs.