Cases where alter table set type varchar(longer length) still needs table rewrite

Started by Jeremy Finzelabout 6 years ago7 messagesgeneral
Jump to latest
#1Jeremy Finzel
finzelj@gmail.com

Good morning!

We are a little bit puzzled because running the following command on a 9.6
cluster is apparently requiring a table rewrite, or at least a very long
operation of some kind, even though the docs say that as of 9.2:

-

Increasing the length limit for a varchar or varbit column, or removing
the limit altogether, no longer requires a table rewrite. Similarly,
increasing the allowable precision of a numeric column, or changing a
column from constrained numeric to unconstrained numeric, no longer
requires a table rewrite. Table rewrites are also avoided in similar cases
involving the interval, timestamp, and timestamptz types.

I have a table foo with 100 million rows, and a column:

- id character varying(20)

The following command is the one that we expect to execute very quickly (we
are not seeing any locking), but it is instead taking a very long time:

- ALTER TABLE foo ALTER COLUMN id TYPE varchar(100);

I attempted the same operation instead cast to text - same problem.

Why do the docs seem wrong in our case? I have a guess: if the table was
created prior to version 9.2, perhaps they are not binary coercible to text
after 9.2? In any case, I would be very grateful for an explanation!

Thank you!
Jeremy

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeremy Finzel (#1)
Re: Cases where alter table set type varchar(longer length) still needs table rewrite

Jeremy Finzel <finzelj@gmail.com> writes:

I have a table foo with 100 million rows, and a column:
- id character varying(20)
The following command is the one that we expect to execute very quickly (we
are not seeing any locking), but it is instead taking a very long time:
- ALTER TABLE foo ALTER COLUMN id TYPE varchar(100);

Hm, the code is supposed to avoid a table rewrite, but I wonder if
there's something else that's not being avoided, such as an index
rebuild or foreign-key verification. Could we see the whole table
definition, eg from psql \d+ ?

regards, tom lane

#3Jeremy Finzel
finzelj@gmail.com
In reply to: Tom Lane (#2)
Re: Cases where alter table set type varchar(longer length) still needs table rewrite

On Mon, Feb 17, 2020 at 8:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jeremy Finzel <finzelj@gmail.com> writes:

I have a table foo with 100 million rows, and a column:
- id character varying(20)
The following command is the one that we expect to execute very quickly

(we

are not seeing any locking), but it is instead taking a very long time:
- ALTER TABLE foo ALTER COLUMN id TYPE varchar(100);

Hm, the code is supposed to avoid a table rewrite, but I wonder if
there's something else that's not being avoided, such as an index
rebuild or foreign-key verification. Could we see the whole table
definition, eg from psql \d+ ?

regards, tom lane

Based on your feedback, I quickly identified that indeed, the following
index is causing the re-type to be slow:

"id_idx" btree ("substring"(id::text, 4, 7))

I'm still not sure why a rebuild of this index would be required,
technically speaking. But perhaps in any case the docs should have
something to the effect that expression indexes may require rebuild under
specific circumstances?

Thanks!
Jeremy

#4Ron
ronljohnsonjr@gmail.com
In reply to: Jeremy Finzel (#3)
Re: Cases where alter table set type varchar(longer length) still needs table rewrite

On 2/17/20 9:01 AM, Jeremy Finzel wrote:

On Mon, Feb 17, 2020 at 8:21 AM Tom Lane <tgl@sss.pgh.pa.us
<mailto:tgl@sss.pgh.pa.us>> wrote:

Jeremy Finzel <finzelj@gmail.com <mailto:finzelj@gmail.com>> writes:

I have a table foo with 100 million rows, and a column:
    - id character varying(20)
The following command is the one that we expect to execute very

quickly (we

are not seeing any locking), but it is instead taking a very long time:
    - ALTER TABLE foo ALTER COLUMN id TYPE varchar(100);

Hm, the code is supposed to avoid a table rewrite, but I wonder if
there's something else that's not being avoided, such as an index
rebuild or foreign-key verification.  Could we see the whole table
definition, eg from psql \d+ ?

                        regards, tom lane

Based on your feedback, I quickly identified that indeed, the following
index is causing the re-type to be slow:

"id_idx" btree ("substring"(id::text, 4, 7))

I'm still not sure why a rebuild of this index would be required,
technically speaking.  But perhaps in any case the docs should have
something to the effect that expression indexes may require rebuild under
specific circumstances?

How much faster would "it" be if you dropped the index, ran ALTER and
rebuilt the index?  Or is it too late?

--
Angular momentum makes the world go 'round.

#5Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Jeremy Finzel (#3)
Re: Cases where alter table set type varchar(longer length) still needs table rewrite

On 2/17/20 7:01 AM, Jeremy Finzel wrote:

On Mon, Feb 17, 2020 at 8:21 AM Tom Lane <tgl@sss.pgh.pa.us
<mailto:tgl@sss.pgh.pa.us>> wrote:

Jeremy Finzel <finzelj@gmail.com <mailto:finzelj@gmail.com>> writes:

I have a table foo with 100 million rows, and a column:
    - id character varying(20)
The following command is the one that we expect to execute very

quickly (we

are not seeing any locking), but it is instead taking a very long

time:

    - ALTER TABLE foo ALTER COLUMN id TYPE varchar(100);

Hm, the code is supposed to avoid a table rewrite, but I wonder if
there's something else that's not being avoided, such as an index
rebuild or foreign-key verification.  Could we see the whole table
definition, eg from psql \d+ ?

                        regards, tom lane

Based on your feedback, I quickly identified that indeed, the following
index is causing the re-type to be slow:

"id_idx" btree ("substring"(id::text, 4, 7))

I'm still not sure why a rebuild of this index would be required,
technically speaking.  But perhaps in any case the docs should have
something to the effect that expression indexes may require rebuild
under specific circumstances?

How about?:

https://www.postgresql.org/docs/9.6/sql-altertable.html

"Adding a column with a DEFAULT clause or changing the type of an
existing column will require the entire table and its indexes to be
rewritten. As an exception when changing the type of an existing column,
if the USING clause does not change the column contents and the old type
is either binary coercible to the new type or an unconstrained domain
over the new type, a table rewrite is not needed; but any indexes on the
affected columns must still be rebuilt. Adding or removing a system oid
column also requires rewriting the entire table. Table and/or index
rebuilds may take a significant amount of time for a large table; and
will temporarily require as much as double the disk space."

Thanks!
Jeremy

--
Adrian Klaver
adrian.klaver@aklaver.com

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Adrian Klaver (#5)
Re: Cases where alter table set type varchar(longer length) still needs table rewrite

Adrian Klaver <adrian.klaver@aklaver.com> writes:

On 2/17/20 7:01 AM, Jeremy Finzel wrote:

I'm still not sure why a rebuild of this index would be required,
technically speaking. But perhaps in any case the docs should have
something to the effect that expression indexes may require rebuild
under specific circumstances?

[ it is already ]

Yeah. In principle, we wouldn't need to rebuild the indexes in this
case, since there's no semantic difference between a value sourced
from a varchar(N) column and a varchar(some-other-N) column. In general,
though, ALTER COLUMN TYPE doesn't know whether that's true; and there
are definitely binary-compatible cases where it *does* matter.
(An example is that coercing integer to OID is allowed without a
table rewrite, but an index rebuild is needed because the sort
order is different.) So right now, any index mentioning the altered
column has to be rebuilt.

Maybe someday we'll figure out how to do better. I'm kind of wondering
whether it wouldn't be safe to assume that changes that only change the
typmod and not the type OID don't require index rebuilds.

regards, tom lane

#7Jeremy Finzel
finzelj@gmail.com
In reply to: Adrian Klaver (#5)
Re: Cases where alter table set type varchar(longer length) still needs table rewrite

On Mon, Feb 17, 2020 at 10:46 AM Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

How about?:

https://www.postgresql.org/docs/9.6/sql-altertable.html

"Adding a column with a DEFAULT clause or changing the type of an
existing column will require the entire table and its indexes to be
rewritten. As an exception when changing the type of an existing column,
if the USING clause does not change the column contents and the old type
is either binary coercible to the new type or an unconstrained domain
over the new type, a table rewrite is not needed; but any indexes on the
affected columns must still be rebuilt. Adding or removing a system oid
column also requires rewriting the entire table. Table and/or index
rebuilds may take a significant amount of time for a large table; and
will temporarily require as much as double the disk space."

Thanks!
Jeremy

--
Adrian Klaver
adrian.klaver@aklaver.com

You mean the part "any indexes on the affected columns must still be
rebuilt"? Yes, I guess that is pretty clear. Thanks,

Jeremy