Switching Primary Keys to BigInt
Hi all,
We are running 9.6, and we are planning to move some primary keys from int
to bigint because we are approaching the type limit. We understand this
requires some downtime, but we want to know if there are things we can do
to limit it.
Here are our steps, with questions at the end.
ALTER TABLE some_table ADD COLUMN new_id bigint;
/* in batches, we update all the rows to new_id = id */
CREATE UNIQUE INDEX CONCURRENTLY some_table_pkey_new ON some_table(new_id);
/* take the apps down */
BEGIN;
LOCK TABLE some_table;
UPDATE some_table SET new_id = id WHERE new_id IS NULL;
ALTER SEQUENCE some_table_id_seq OWNED BY some_table.new_id;
ALTER TABLE some_table ALTER COLUMN new_id SET DEFAULT
nextval('some_table_id_seq'::regclass);
ALTER TABLE some_table DROP CONSTRAINT some_table_pkey;
ALTER TABLE some_table ADD CONSTRAINT some_table_pkey PRIMARY KEY USING
INDEX some_table_pkey_new;
ALTER TABLE some_table DROP COLUMN id;
ALTER TABLE some_table RENAME COLUMN new_id to id;
COMMIT;
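For reference, the batched backfill mentioned in the comment above looks
roughly like this; the batch size and the driving loop are just placeholders:
UPDATE some_table
SET new_id = id
WHERE id IN (
    SELECT id FROM some_table
    WHERE new_id IS NULL
    ORDER BY id
    LIMIT 10000
);
/* repeated from a script, with a short pause between batches, until it
   updates zero rows */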
We are concerned with this step:
ALTER TABLE some_table ADD CONSTRAINT some_table_pkey PRIMARY KEY USING
INDEX some_table_pkey_new;
which requires a table scan. Is there a way to avoid that? Would a not null
constraint on new_id that is created as invalid first, then validated later
help us? I tried on a table with about 50 million records, and I see a drop
from 19 seconds spent on the alter to 8 seconds, which is inconclusive
(both after restarts for cold cache). Is there another way to tell? Or does
PG just have to do a sequential scan?
If the constraint idea works, we would probably need to add a trigger to
update new_id, but that's TBD.
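If we go that route, a rough sketch of what we have in mind (names are
placeholders) would be something like:
/* keep new_id in sync for rows written while the backfill runs */
CREATE FUNCTION some_table_sync_new_id() RETURNS trigger AS $$
BEGIN
    NEW.new_id := NEW.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER some_table_sync_new_id
BEFORE INSERT OR UPDATE ON some_table
FOR EACH ROW EXECUTE PROCEDURE some_table_sync_new_id();
/* the "invalid first, validated later" idea: adding the constraint as NOT
   VALID is cheap, and VALIDATE CONSTRAINT only takes a SHARE UPDATE
   EXCLUSIVE lock, so both can happen before the downtime window */
ALTER TABLE some_table
    ADD CONSTRAINT some_table_new_id_not_null
    CHECK (new_id IS NOT NULL) NOT VALID;
ALTER TABLE some_table VALIDATE CONSTRAINT some_table_new_id_not_null;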
On 7/21/20 8:30 AM, Mohamed Wael Khobalatte wrote:
Hi all,
We are running 9.6, and we are planning to move some primary keys from
int to bigint because we are approaching the type limit. We understand
this requires some downtime, but we want to know if there are things we
can do to limit it.
Here are our steps, with questions at the end.
ALTER TABLE some_table ADD COLUMN new_id bigint;
/* in batches, we update all the rows to new_id = id */
CREATE UNIQUE INDEX CONCURRENTLY some_table_pkey_new ON some_table(new_id);
/* take the apps down */
BEGIN;
LOCK TABLE some_table;
UPDATE some_table SET new_id = id WHERE new_id IS NULL;
ALTER SEQUENCE some_table_id_seq OWNED BY some_table.new_id;
ALTER TABLE some_table ALTER COLUMN new_id SET DEFAULT
nextval('some_table_id_seq'::regclass);
ALTER TABLE some_table DROP CONSTRAINT some_table_pkey;
ALTER TABLE some_table ADD CONSTRAINT some_table_pkey PRIMARY KEY USING
INDEX some_table_pkey_new;
ALTER TABLE some_table DROP COLUMN id;
ALTER TABLE some_table RENAME COLUMN new_id to id;
COMMIT;
Could you not simplify to something like this:
test_(aklaver)5432> create table change_seq(id serial PRIMARY KEY);
CREATE TABLE
test_(aklaver)5432> \d change_seq
Table "public.change_seq"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+----------------------------------------
id | integer | | not null |
nextval('change_seq_id_seq'::regclass)
Indexes:
"change_seq_pkey" PRIMARY KEY, btree (id)
test_(aklaver)5432> alter table change_seq alter COLUMN id set data type
bigint;
ALTER TABLE
test_(aklaver)5432> \d change_seq
Table "public.change_seq"
Column | Type | Collation | Nullable | Default
--------+--------+-----------+----------+----------------------------------------
id | bigint | | not null |
nextval('change_seq_id_seq'::regclass)
Indexes:
"change_seq_pkey" PRIMARY KEY, btree (id)
test_(aklaver)5432> alter sequence change_seq_id_seq as bigint;
ALTER SEQUENCE
We are concerned with this step:
ALTER TABLE some_table ADD CONSTRAINT some_table_pkey PRIMARY KEY
USING INDEX some_table_pkey_new;
which requires a table scan. Is there a way to avoid that? Would a not
null constraint on new_id that is created as invalid first, then
validated later help us? I tried on a table with about 50 million
records, and I see a drop from 19 seconds spent on the alter to 8
seconds, which is inconclusive (both after restarts for cold cache). Is
there another way to tell? Or does PG just have to do a sequential scan?
If the constraint idea works, we would probably need to add a trigger to
update new_id, but that's TBD.
--
Adrian Klaver
adrian.klaver@aklaver.com
On 7/21/20 11:17 AM, Adrian Klaver wrote:
On 7/21/20 8:30 AM, Mohamed Wael Khobalatte wrote:
Hi all,
test_(aklaver)5432> alter table change_seq alter COLUMN id set data type
bigint;
ALTER TABLE
test_(aklaver)5432> \d change_seq
Table "public.change_seq"
Column | Type | Collation | Nullable | Default
--------+--------+-----------+----------+----------------------------------------id | bigint | | not null |
nextval('change_seq_id_seq'::regclass)
Indexes:
"change_seq_pkey" PRIMARY KEY, btree (id)
Forgot that sequences are bigint by default. It would not hurt to check
pg_sequence just to make sure that they are; in that case the below is
not needed.
test_(aklaver)5432> alter sequence change_seq_id_seq as bigint;
ALTER SEQUENCE
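To check, something along these lines should work (note that pg_sequence
only exists in 10 and up; on 9.6 you would query the sequence relation
directly):
SELECT seqrelid::regclass AS seq, seqtypid::regtype AS type
FROM pg_sequence
WHERE seqrelid = 'change_seq_id_seq'::regclass;
/* on 9.6, there is no pg_sequence catalog; the sequence's max_value tells
   you whether it can exceed the int range */
SELECT max_value FROM change_seq_id_seq;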
--
Adrian Klaver
adrian.klaver@aklaver.com
Curious- what requires that the unique index be declared a primary key?
What advantage does that give you? Just ensuring it isn't null?
Side note- EOL for 9.6 is coming next year so just a plug for upgrading
when possible, perhaps utilizing pglogical to get to v11 or 12.
test_(aklaver)5432> alter table change_seq alter COLUMN id set data type
bigint;
ALTER TABLE
test_(aklaver)5432> \d change_seq
                             Table "public.change_seq"
 Column |  Type  | Collation | Nullable |                Default
--------+--------+-----------+----------+-----------------------------------------
 id     | bigint |           | not null | nextval('change_seq_id_seq'::regclass)
Indexes:
    "change_seq_pkey" PRIMARY KEY, btree (id)
This is significant downtime, since it locks exclusively, no? We want to
avoid that.
Side note- EOL for 9.6 is coming next year so just a plug for upgrading
when possible, perhaps utilizing pglogical to get to v11 or 12.
Yep, we are painfully aware. The id growth will beat us to it, so we need
to deal with that first.
On 7/21/20 2:18 PM, Mohamed Wael Khobalatte wrote:
> test_(aklaver)5432> alter table change_seq alter COLUMN id set data type
bigint;
ALTER TABLE
test_(aklaver)5432> \d change_seq
Table "public.change_seq"
Column | Type | Collation | Nullable | Default
--------+--------+-----------+----------+----------------------------------------id | bigint | | not null |
nextval('change_seq_id_seq'::regclass)
Indexes:
"change_seq_pkey" PRIMARY KEY, btree (id)This is significant downtime, since it locks exclusively, no? We want to
avoid that.
Yeah, I thought the int --> bigint would not do a table rewrite. Testing
showed otherwise. Forget that idea.
Side note- EOL for 9.6 is coming next year so just a plug for
upgrading when possible, perhaps utilizing pglogical to get to v11 or 12.
Yep, we are painfully aware. The id growth will beat us to it, so we
need to deal with that first.
--
Adrian Klaver
adrian.klaver@aklaver.com
On Tue, Jul 21, 2020 at 7:32 PM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:
Yeah, I thought the int --> bigint would not do a table rewrite. Testing
showed otherwise. Forget that idea.
Got it. Not sure what else we should consider. It seemed like the
constraint might be possible, but we would need a far bigger table to be
able to tell for sure, since we can't EXPLAIN a DDL statement.
Mohamed Wael Khobalatte wrote:
alter table change_seq alter COLUMN id set data
type bigint;
This is significant downtime, since it locks exclusively, no? We want to
avoid that.
Well, in the steps you mentioned upthread, the transaction starts by
doing LOCK TABLE some_table, so it will hold an exclusive lock on it
for the rest of the transaction.
If you can test how the ALTER TABLE... SET TYPE ... compares
to your procedure in terms of downtime, that would be interesting.
To me, it's not clear why the procedure in multiple steps would
be better overall than a single ALTER TABLE.
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite
On Wed, Jul 22, 2020 at 9:27 AM Daniel Verite <daniel@manitou-mail.org>
wrote:
Mohamed Wael Khobalatte wrote:
alter table change_seq alter COLUMN id set data
type bigint;
This is significant downtime, since it locks exclusively, no? We want to
avoid that.
Well, in the steps you mentioned upthread, the transaction starts by
doing LOCK TABLE some_table, so it will hold an exclusive lock on it
for the rest of the transaction.
If you can test how the ALTER TABLE... SET TYPE ... compares
to your procedure in terms of downtime, that would be interesting.
To me, it's not clear why the procedure in multiple steps would
be better overall than a single ALTER TABLE.
We lock the table as a precaution, with the understanding that we are
undergoing a "small" downtime to finish replacing the int id with the new
bigint. The only slow thing in my procedure is the sequential scan that the
ADD CONSTRAINT does because the column is a primary key. A direct ALTER
TABLE would be far slower, not to mention the extra space needed for the
table rewrite?
Mohamed Wael Khobalatte wrote:
We lock the table as a precaution, with the understanding that we are
undergoing a "small" downtime to finish replacing the int id with the new
bigint
Ah, sorry I overlooked that most row updates are done
pre-transaction in a preliminary step:
/* in batches, we update all the rows to new_id = id */
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite
On Wed, Jul 22, 2020 at 11:13 AM Daniel Verite <daniel@manitou-mail.org>
wrote:
Mohamed Wael Khobalatte wrote:
We lock the table as a precaution, with the understanding that we are
undergoing a "small" downtime to finish replacing the int id with the new
bigint.
Ah, sorry I overlooked that most row updates are done
pre-transaction in a preliminary step:
/* in batches, we update all the rows to new_id = id */
No worries. I suppose the answer to the original question, which is how to
avoid a table scan when adding a primary key constraint to a newly
backfilled column, is "there is no way"? Downtime might be at least as long
as the table scan.
On Wed, Jul 22, 2020 at 12:23 PM Mohamed Wael Khobalatte <
mkhobalatte@grubhub.com> wrote:
No worries. I suppose the answer to the original question, which is how to
avoid a table scan when adding a primary key constraint to a newly
backfilled column, is "there is no way"? Downtime might be at least as long
as the table scan.
One presumes you may be planning to use pglogical or another similar
solution to upgrade to a new Postgres version soon, and would have a
convenient time then to change schema. I am curious, why not just stick
with the single column unique index and forgo for the primary key
constraint for now? If you are concerned about the possibility of a single
null value being inserted, then you could add a not valid check constraint
to enforce that for future rows.
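Something like this, reusing the names from your steps upthread (just a
sketch): some_table_pkey_new stays as an ordinary unique index, there is no
PRIMARY KEY at all, and the check below protects future rows without ever
scanning the existing ones, since you never have to VALIDATE it.
ALTER TABLE some_table
    ADD CONSTRAINT some_table_new_id_not_null
    CHECK (new_id IS NOT NULL) NOT VALID;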
You do you. Obviously testing the primary key on a full replica of the data
with similar hardware and configs will give you a pretty good idea of the
time for that table scan and adding the constraint in real life. Given your
email domain, I can guess why you would need to absolutely minimize
downtime.
One presumes you may be planning to use pglogical or another similar
solution to upgrade to a new Postgres version soon, and would have a
convenient time then to change schema. I am curious, why not just stick
with the single column unique index and forgo the primary key
constraint for now? If you are concerned about the possibility of a single
null value being inserted, then you could add a not valid check constraint
to enforce that for future rows.
Believe it or not, I pitched a similar idea internally. I'll explore it in a
test run.
Given your email domain, I can guess why you would need to absolutely
minimize downtime.
Right. ;)
Believe it or not, I pitched a similar idea internally. I'll explore it in
a test run.
By "similar idea", I am referring to your suggestion of dropping the
primary key constraint.
On 2020-07-21 at 17:30, Mohamed Wael Khobalatte wrote:
we are planning to move some primary keys from int to bigint because we are approaching the type limit
If that does not break your business logic, you might arrange to use the
negative half of the ::int value range. Ugly, but this might at least buy
you some time before finding the definitive and elegant way, if you are
under some pressure. I do not recommend this, but this is what once saved
my life (or at least one night), after I realized that my PK had already
reached the limit :-).
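Roughly something like this, if you ever need it (the sequence name here is
just an example):
ALTER SEQUENCE some_table_id_seq MINVALUE -2147483648 RESTART WITH -2147483648;
/* new ids then climb from the bottom of the int range back toward zero */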
On Wed, Jul 22, 2020 at 4:45 PM Ireneusz Pluta/wp.pl <ipluta@wp.pl> wrote:
On 2020-07-21 at 17:30, Mohamed Wael Khobalatte wrote:
we are planning to move some primary keys from int to bigint because we
are approaching the type limit
If that does not break your business logic, you might arrange to use the
negative half of the ::int value range. Ugly, but this might at least buy
you some time before finding the definitive and elegant way, if you are
under some pressure. I do not recommend this, but this is what once saved
my life (or at least one night), after I realized that my PK had already
reached the limit :-).
Very clever. I think we are set with the current approach. The issue was
more how much downtime, not how fast we are approaching the limit (which is
also a real issue but not of concern in this thread).
On Tue, Jul 21, 2020 at 11:30 AM Mohamed Wael Khobalatte <
mkhobalatte@grubhub.com> wrote:
Hi all,
We are running 9.6, and we are planning to move some primary keys from int
to bigint because we are approaching the type limit. We understand this
requires some downtime, but we want to know if there are things we can do
to limit it.
Here are our steps, with questions at the end.
ALTER TABLE some_table ADD COLUMN new_id bigint;
/* in batches, we update all the rows to new_id = id */
CREATE UNIQUE INDEX CONCURRENTLY some_table_pkey_new ON
some_table(new_id);
/* take the apps down */
BEGIN;
LOCK TABLE some_table;
UPDATE some_table SET new_id = id WHERE new_id IS NULL;
ALTER SEQUENCE some_table_id_seq OWNED BY some_table.new_id;
ALTER TABLE some_table ALTER COLUMN new_id SET DEFAULT
nextval('some_table_id_seq'::regclass);
ALTER TABLE some_table DROP CONSTRAINT some_table_pkey;
ALTER TABLE some_table ADD CONSTRAINT some_table_pkey PRIMARY KEY USING
INDEX some_table_pkey_new;
ALTER TABLE some_table DROP COLUMN id;
ALTER TABLE some_table RENAME COLUMN new_id to id;
COMMIT;
We are concerned with this step:
ALTER TABLE some_table ADD CONSTRAINT some_table_pkey PRIMARY KEY USING
INDEX some_table_pkey_new;
which requires a table scan. Is there a way to avoid that? Would a not
null constraint on new_id that is created as invalid first, then validated
later help us? I tried on a table with about 50 million records, and I see
a drop from 19 seconds spent on the alter to 8 seconds, which is
inconclusive (both after restarts for cold cache). Is there another way to
tell? Or does PG just have to do a sequential scan?
If the constraint idea works, we would probably need to add a trigger to
update new_id, but that's TBD.
The process I have outlined above worked beautifully. Downtime was exactly
what I thought it would be, i.e. equal to a sequential scan of the table in
question (almost down to the second). I am writing this in case someone out
there wants to adopt a similar mechanism.
Thank you all for your valuable input.