alter column type

Started by Casey Deccioalmost 11 years ago12 messagesgeneral
Jump to latest
#1Casey Deccio
casey@deccio.net

I have a database in which one table references the primary key of
another. The type of the primary key was initially int, but I changed it
to bigint. However, I forgot to update the type of a column that
references it. So, I've initiated "ALTER TABLE foo ALTER COLUMN bar TYPE
bigint", where foo/bar is the table/column referencing the primary key that
is now of type bigint.

However, with 2^31 rows, it is taking a "long" time to write the rows (it's
been 12 hours). Is there a more efficient way to do this? Even if/when
this one finishes, there are other column types that I have to update.
This update effectively locked me out of all access to the data anyway, so
I don't foresee any concern of writes that might affect integrity.

Cheers,
Casey

#2Melvin Davidson
melvin6925@gmail.com
In reply to: Casey Deccio (#1)
Re: alter column type

CAUTION: This is very dangerous and may cause corruption.
*** DO THIS IN A TEST DATABASE FIRST ***

--1. Get the oid for int8 (bigint)
SELECT t.oid
FROM pg_type t
WHERE typname = 'int8';

--2. Get the oid for your table
SELECT c.oid,
c.relname as table,
a.attname ,
a.atttypid,
a.*
FROM pg_class c
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_attribute a ON ( a.attrelid = c.oid )
WHERE c.relname = 'dogs'
AND n.nspname = 'public'
AND a.attname = 'good_watchdog'
AND NOT attisdropped;

BEGIN;

UPDATE pg_attribute a
SET atttypid = <t.oid from 1>
WHERE a.attrelid = <c.oid from 2>
AND attname = <your column to change>;

COMMIT;

On Fri, Jun 5, 2015 at 12:23 PM, Casey Deccio <casey@deccio.net> wrote:

I have a database in which one table references the primary key of
another. The type of the primary key was initially int, but I changed it
to bigint. However, I forgot to update the type of a column that
references it. So, I've initiated "ALTER TABLE foo ALTER COLUMN bar TYPE
bigint", where foo/bar is the table/column referencing the primary key that
is now of type bigint.

However, with 2^31 rows, it is taking a "long" time to write the rows
(it's been 12 hours). Is there a more efficient way to do this? Even
if/when this one finishes, there are other column types that I have to
update. This update effectively locked me out of all access to the data
anyway, so I don't foresee any concern of writes that might affect
integrity.

Cheers,
Casey

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#3Michael Nolan
htfoot@gmail.com
In reply to: Casey Deccio (#1)
Re: alter column type

On Fri, Jun 5, 2015 at 12:23 PM, Casey Deccio <casey@deccio.net> wrote:

I have a database in which one table references the primary key of
another. The type of the primary key was initially int, but I changed it
to bigint. However, I forgot to update the type of a column that
references it. So, I've initiated "ALTER TABLE foo ALTER COLUMN bar TYPE
bigint", where foo/bar is the table/column referencing the primary key that
is now of type bigint.

However, with 2^31 rows, it is taking a "long" time to write the rows
(it's been 12 hours). Is there a more efficient way to do this? Even
if/when this one finishes, there are other column types that I have to
update. This update effectively locked me out of all access to the data
anyway, so I don't foresee any concern of writes that might affect
integrity.

Cheers,
Casey

Probably too late for this time, but in the past when I've needed to
redefine the type for a column, I've made a dump, edited the dump file to
change the type and then renamed the table and reloaded it. That's usually
several orders of magnitude faster.
--
Mike Nolan
nolan@tssi.com

#4Ravi Krishna
sravikrishna3@gmail.com
In reply to: Melvin Davidson (#2)
Re: alter column type

Why is PG even re-writing all rows when the data type is being changed
from smaller (int) to larger (bigint) type, which automatically means
existing data is safe. Like, changing from varchar(30) to varchar(50)
should involve no rewrite of existing rows.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Casey Deccio
casey@deccio.net
In reply to: Michael Nolan (#3)
Re: alter column type

On Fri, Jun 5, 2015 at 2:36 PM, Michael Nolan <htfoot@gmail.com> wrote:

Probably too late for this time, but in the past when I've needed to
redefine the type for a column, I've made a dump, edited the dump file to
change the type and then renamed the table and reloaded it. That's usually
several orders of magnitude faster.

Actually, not too late. My first ALTER is still running, and I still have
four more to go. Sigh. I had thought of this but wasn't sure how it might
compare. Thanks for the data point :)

Casey

#6John R Pierce
pierce@hogranch.com
In reply to: Ravi Krishna (#4)
Re: alter column type

On 6/5/2015 11:37 AM, Ravi Krishna wrote:

Why is PG even re-writing all rows when the data type is being changed
from smaller (int) to larger (bigint) type, which automatically means
existing data is safe. Like, changing from varchar(30) to varchar(50)
should involve no rewrite of existing rows.

int to bigint requires storage change, as all bigints are 64 bit while
all ints are 32 bit. it would be a MESS to try and keep track of a
table that has some int and some bigint storage of a given field.

now, varchar 30 to 50, that I can't answer, are you sure that does a
rewrite? the storage is exactly the same for those.

--
john r pierce, recycling bits in santa cruz

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#7Casey Deccio
casey@deccio.net
In reply to: Melvin Davidson (#2)
Re: alter column type

On Fri, Jun 5, 2015 at 2:23 PM, Melvin Davidson <melvin6925@gmail.com>
wrote:

CAUTION: This is very dangerous and may cause corruption.
*** DO THIS IN A TEST DATABASE FIRST ***

--1. Get the oid for int8 (bigint)
SELECT t.oid
FROM pg_type t
WHERE typname = 'int8';

--2. Get the oid for your table
SELECT c.oid,
c.relname as table,
a.attname ,
a.atttypid,
a.*
FROM pg_class c
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_attribute a ON ( a.attrelid = c.oid )
WHERE c.relname = 'dogs'
AND n.nspname = 'public'
AND a.attname = 'good_watchdog'
AND NOT attisdropped;

BEGIN;

UPDATE pg_attribute a
SET atttypid = <t.oid from 1>
WHERE a.attrelid = <c.oid from 2>
AND attname = <your column to change>;

COMMIT;

Thanks for the idea. Since I'm planning to dump the database first anyway
(using Michael's suggestion) I'm thinking I'll try this on the live
database, after I get it safely dumped. It seemed to work on a test
database.

Being unfamiliar with the internals, what's the risk here? If postgres
thinks something is a bigint, but previously stored it as an int, does that
mean it will try to extract data beyond the boundary of some of the (old)
32-bit values and potentially throw off offsets for other values?

Casey

#8Ravi Krishna
sravikrishna3@gmail.com
In reply to: John R Pierce (#6)
Re: alter column type

On 6/5/2015 11:37 AM, Ravi Krishna wrote:

Why is PG even re-writing all rows when the data type is being changed
from smaller (int) to larger (bigint) type, which automatically means
existing data is safe. Like, changing from varchar(30) to varchar(50)
should involve no rewrite of existing rows.

int to bigint requires storage change, as all bigints are 64 bit while all
ints are 32 bit. it would be a MESS to try and keep track of a table
that has some int and some bigint storage of a given field.

now, varchar 30 to 50, that I can't answer, are you sure that does a
rewrite? the storage is exactly the same for those.

Perhaps I was not clear. I don't expect any re-write for a change of
varchar(30) to 50 for the same reason you mentioned above.

Yes it is normal to expect the storage size for bigint to be different
than 32 bit, but then PG uses MVCC. If and when current row gets
updated, MVCC will ensure a new row to be written, which can fix the
data type.

I believe PG adds or drops a col without rewrite because of MVCC. For
eg, I add a new col-T in a table and drop col-S via a single ALTER
TABLE command. I am assuming this is what happens internally:

In the above case PG will simply do a dictionary update of meta
tables. So all new rows will reflect col-T and as and when the old
rows get modified, it too will get updated to the new structure.

If my above understand is correct, why it is not applied in case of
int -> bigint change.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9Ravi Krishna
sravikrishna3@gmail.com
In reply to: Ravi Krishna (#8)
Re: alter column type

In the above case PG will simply do a dictionary update of meta
tables. So all new rows will reflect col-T and as and when the old

I will clarify it bit further:

All new rows will have space allocated for col-T and no space
allocated for col-S, while existing dormant rows are left unmodified .

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Casey Deccio (#7)
Re: alter column type

Casey Deccio <casey@deccio.net> writes:

Being unfamiliar with the internals, what's the risk here? If postgres
thinks something is a bigint, but previously stored it as an int, does that
mean it will try to extract data beyond the boundary of some of the (old)
32-bit values and potentially throw off offsets for other values?

Yes. This *will* break your table, spectacularly. The other person who
was opining that it would work has no understanding of the actual storage
layout.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#11John R Pierce
pierce@hogranch.com
In reply to: Casey Deccio (#5)
Re: alter column type

On 6/5/2015 11:46 AM, Casey Deccio wrote:

On Fri, Jun 5, 2015 at 2:36 PM, Michael Nolan <htfoot@gmail.com
<mailto:htfoot@gmail.com>> wrote:

Probably too late for this time, but in the past when I've needed
to redefine the type for a column, I've made a dump, edited the
dump file to change the type and then renamed the table and
reloaded it. That's usually several orders of magnitude faster.

Actually, not too late. My first ALTER is still running, and I still
have four more to go. Sigh. I had thought of this but wasn't sure
how it might compare. Thanks for the data point :)

if all 5 alters' were to the same table, you should have combined them
into one alter statement.

--
john r pierce, recycling bits in santa cruz

#12Casey Deccio
casey@deccio.net
In reply to: John R Pierce (#11)
Re: alter column type

On Fri, Jun 5, 2015 at 4:00 PM, John R Pierce <pierce@hogranch.com> wrote:

Actually, not too late. My first ALTER is still running, and I still have
four more to go. Sigh. I had thought of this but wasn't sure how it might
compare. Thanks for the data point :)

if all 5 alters' were to the same table, you should have combined them
into one alter statement

Actually, there are six in four different tables.

But... I didn't know I could combine multiple ALTERs into a single
statement. Thanks for the heads up.

Casey