Re: ALTER TABLE .. ALTER COLUMN .. ERROR: attribute .. has wrong type

Started by Justin Pryzbyover 9 years ago30 messageshackers
Jump to latest
#1Justin Pryzby
pryzby@telsasoft.com

I originally sent to psql-general some months ago, but it appears it was never
delivered (perhaps I wasn't properly subscribed?).

Failed to alter table eric_umts_rnc_utrancell_metrics: ERROR: attribute 361 has wrong type
DETAIL: Table has type integer, but query expects smallint.

We've seen this at least 4 times now, on PG95 and 9.6; 3 of those times are for
the above table.

Any ideas what I can do to either reproduce it or otherwise avoid it ?

On Wed, Oct 12, 2016 at 10:25:05AM -0500, Justin Pryzby wrote:

We've seen this happen at least once on a 9.5 server, and twice on (the same)
server since its upgrade last week to 9.6:

ALTER TABLE t ALTER column TYPE says: "ERROR: attribute 81 has wrong type".

Just now under 9.6
DETAIL: Table has type integer, but query expects smallint
...
ts=# SELECT attnum, atttypid, attrelid::regclass FROM pg_attribute WHERE attname='pmnopagingattemptutranrejected' ORDER BY 1 DESC,2,3;
attnum | atttypid | attrelid
--------+----------+---------------------------------
193 | 21 | eric_umts_rnc_utrancell_metrics
193 | 21 | eric_umts_rnc_utrancell_201508
179 | 21 | eric_umts_rnc_utrancell_201509
179 | 21 | eric_umts_rnc_utrancell_201510
179 | 21 | eric_umts_rnc_utrancell_201511
179 | 21 | eric_umts_rnc_utrancell_201602
[...]
179 | 21 | eric_umts_rnc_utrancell_201610
179 | 21 | eric_umts_rnc_utrancell_201611
(17 rows)

Last week (same server, same table, still 9.6):
DETAIL: Table has type real, but query expects smallint

In July (different server) under 9.5
DETAIL: Table has type real, but query expects smallint
...
SELECT atttypid, attnum, attrelid::regclass FROM pg_attribute WHERE attname='c_84150886'
atttypid | attnum | attrelid
----------+--------+-----------------------------
21 | 200 | huawei_msc_trunkgrp_201605
21 | 200 | huawei_msc_trunkgrp_201604
21 | 200 | huawei_msc_trunkgrp_201603
21 | 200 | huawei_msc_trunkgrp_201602
21 | 200 | huawei_msc_trunkgrp_201512
21 | 200 | huawei_msc_trunkgrp_201511
21 | 200 | huawei_msc_trunkgrp_201510
21 | 200 | huawei_msc_trunkgrp_201508
21 | 200 | huawei_msc_trunkgrp_201507
21 | 200 | huawei_msc_trunkgrp_201506
21 | 200 | huawei_msc_trunkgrp_201505
21 | 200 | huawei_msc_trunkgrp_201607
21 | 200 | huawei_msc_trunkgrp_201606
21 | 200 | huawei_msc_trunkgrp_201608
21 | 201 | huawei_msc_trunkgrp_metrics
21 | 200 | huawei_msc_trunkgrp_201509
21 | 200 | huawei_msc_trunkgrp_201601
(17 rows)

I don't have a clear recollection how I solved this in July; possibly by
restoring the (historic, partition) table from backup.

Last week again again just now (both under 9.6), a colleague found that he was
able to avoid the error by ALTER TYPE without USING.

Note that we ALTER TABLE .. NO INHERIT the partitions for all but the most
recent 2 months before ALTERing them (or the parent). The "ALTER NO INHERIT"
and the ALTER TYPE of historic partitions are done outside of a transaction in
order to avoid large additional disk use otherwise used when ALTERing a parent
with many or large children (the sum of the size of the children).

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Justin Pryzby
pryzby@telsasoft.com
In reply to: Justin Pryzby (#1)

On Wed, Oct 12, 2016 at 10:25:05AM -0500, Justin Pryzby wrote:

I don't have a clear recollection how I solved this in July; possibly by
restoring the (historic, partition) table from backup.

Last week again again just now (both under 9.6), a colleague found that he was
able to avoid the error by ALTER TYPE without USING.

Note that we ALTER TABLE .. NO INHERIT the partitions for all but the most
recent 2 months before ALTERing them (or the parent). The "ALTER NO INHERIT"
and the ALTER TYPE of historic partitions are done outside of a transaction in
order to avoid large additional disk use otherwise used when ALTERing a parent
with many or large children (the sum of the size of the children).

Here's DETAILs for a 2nd such error which has shown up today:

(EricssonUtranXmlParser): Failed to alter table eric_umts_rnc_utrancell_metrics: ERROR: attribute 424 has wrong type
DETAIL: Table has type smallint, but query expects integer.

(EricssonUtranXmlParser): Failed to alter table eric_umts_rnc_utrancell_metrics: ERROR: attribute 361 has wrong type
DETAIL: Table has type integer, but query expects smallint.

Also, note both alters really do work without "USING":

ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;
BEGIN
DROP VIEW
ERROR: attribute 424 has wrong type
DETAIL: Table has type smallint, but query expects integer.
ts=#

ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT ;
BEGIN
DROP VIEW
ALTER TABLE
ts=#

Is it useful to send something from pg_attribute, or other clues ??

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Robert Haas
robertmhaas@gmail.com
In reply to: Justin Pryzby (#2)

On Mon, Jan 2, 2017 at 7:32 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Wed, Oct 12, 2016 at 10:25:05AM -0500, Justin Pryzby wrote:

I don't have a clear recollection how I solved this in July; possibly by
restoring the (historic, partition) table from backup.

Last week again again just now (both under 9.6), a colleague found that he was
able to avoid the error by ALTER TYPE without USING.

Note that we ALTER TABLE .. NO INHERIT the partitions for all but the most
recent 2 months before ALTERing them (or the parent). The "ALTER NO INHERIT"
and the ALTER TYPE of historic partitions are done outside of a transaction in
order to avoid large additional disk use otherwise used when ALTERing a parent
with many or large children (the sum of the size of the children).

Here's DETAILs for a 2nd such error which has shown up today:

(EricssonUtranXmlParser): Failed to alter table eric_umts_rnc_utrancell_metrics: ERROR: attribute 424 has wrong type
DETAIL: Table has type smallint, but query expects integer.

(EricssonUtranXmlParser): Failed to alter table eric_umts_rnc_utrancell_metrics: ERROR: attribute 361 has wrong type
DETAIL: Table has type integer, but query expects smallint.

Also, note both alters really do work without "USING":

ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;
BEGIN
DROP VIEW
ERROR: attribute 424 has wrong type
DETAIL: Table has type smallint, but query expects integer.
ts=#

ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT ;
BEGIN
DROP VIEW
ALTER TABLE
ts=#

Is it useful to send something from pg_attribute, or other clues ??

So, are these errors reproducible? Like, if you create a brand new
cluster with initdb and a brand new database with createdb and you use
CREATE VIEW to recreate the tables and views and then do this, does
the error reliably happen? Or is this problem unique to your existing
database but it doesn't happen on a new one? If it doesn't reproduce
on a new database, does it reproduce consistently on the existing
database or is that also intermittent?

If nothing else, I'd say the error message is very poor. But there
might be an actual bug here, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Justin Pryzby
pryzby@telsasoft.com
In reply to: Robert Haas (#3)

On Tue, Jan 03, 2017 at 11:45:33AM -0500, Robert Haas wrote:

ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;
BEGIN
DROP VIEW
ERROR: attribute 424 has wrong type
DETAIL: Table has type smallint, but query expects integer.
ts=#

ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT ;
BEGIN
DROP VIEW
ALTER TABLE
ts=#

Is it useful to send something from pg_attribute, or other clues ??

So, are these errors reproducible? Like, if you create a brand new

I can cause the error at will on the existing table, but I wouldn't know how to
reproduce the problem on a new table/database. I'm guessing it has something
to do with dropped columns or historic alters (which I mentioned are typically
done separately on child tables vs their parent).

Since it's happened 3 times now on this table, but not others on this database,
I would guess it's an "data issue", possibly related to pg_upgrades. IOW it
may be impossible to get into this state from a fresh initdb from a current
version.

I considered that perhaps it only affected our oldest tables, and would stop
happening once they were dropped, but note this ALTER is only of a parent and
its 3 most recent children. So only the empty parent could be described as
"old".

Justin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Robert Haas
robertmhaas@gmail.com
In reply to: Justin Pryzby (#4)

On Tue, Jan 3, 2017 at 11:59 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Tue, Jan 03, 2017 at 11:45:33AM -0500, Robert Haas wrote:

ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;
BEGIN
DROP VIEW
ERROR: attribute 424 has wrong type
DETAIL: Table has type smallint, but query expects integer.
ts=#

ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT ;
BEGIN
DROP VIEW
ALTER TABLE
ts=#

Is it useful to send something from pg_attribute, or other clues ??

So, are these errors reproducible? Like, if you create a brand new

I can cause the error at will on the existing table, but I wouldn't know how to
reproduce the problem on a new table/database. I'm guessing it has something
to do with dropped columns or historic alters (which I mentioned are typically
done separately on child tables vs their parent).

Since it's happened 3 times now on this table, but not others on this database,
I would guess it's an "data issue", possibly related to pg_upgrades. IOW it
may be impossible to get into this state from a fresh initdb from a current
version.

I considered that perhaps it only affected our oldest tables, and would stop
happening once they were dropped, but note this ALTER is only of a parent and
its 3 most recent children. So only the empty parent could be described as
"old".

Just for kicks, could you try running pg_catcheck on the affected system?

https://github.com/EnterpriseDB/pg_catcheck

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Justin Pryzby
pryzby@telsasoft.com
In reply to: Robert Haas (#5)

On Tue, Jan 03, 2017 at 01:40:50PM -0500, Robert Haas wrote:

On Tue, Jan 3, 2017 at 11:59 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Tue, Jan 03, 2017 at 11:45:33AM -0500, Robert Haas wrote:

ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;
BEGIN
DROP VIEW
ERROR: attribute 424 has wrong type
DETAIL: Table has type smallint, but query expects integer.
ts=#

I can cause the error at will on the existing table, but I wouldn't know how to
reproduce the problem on a new table/database. I'm guessing it has something

Just for kicks, could you try running pg_catcheck on the affected system?

https://github.com/EnterpriseDB/pg_catcheck

Neat, I hadn't heard of it before ;)

The version in PGDG has the "amkeytype" issue, so I compiled,

I got this:

[pryzbyj@database pg_catcheck]$ ./pg_catcheck ts
notice: pg_shdepend row has invalid classid "2613": not a system catalog OID
row identity: dbid="16402" classid="2613" objid="1086583699" objsubid="0" refclassid="1260" refobjid="16384" deptype="o"
notice: pg_shdepend row has invalid classid "2613": not a system catalog OID
row identity: dbid="16402" classid="2613" objid="1086583701" objsubid="0" refclassid="1260" refobjid="16384" deptype="o"
[...]

notice: pg_depend row has invalid objid "1124153791": no matching entry in pg_class
row identity: classid="1259" objid="1124153791" objsubid="0" refclassid="1259" refobjid="1064197368" refobjsubid="1" deptype="a"

progress: done (294 inconsistencies, 0 warnings, 0 errors)

.. those are the only two problem oids:
[pryzbyj@database pg_catcheck]$ time ./pg_catcheck ts 2>&1 |grep -Evw '2613|1259'
progress: done (264 inconsistencies, 0 warnings, 0 errors)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Justin Pryzby (#6)

Justin Pryzby <pryzby@telsasoft.com> writes:

I can cause the error at will on the existing table,

That's good news, at least.

1. Please trigger it with "\set VERBOSITY verbose" enabled, so we can see
the exact source location --- there are a couple of instances of that
text.

2. Even better would be a stack trace for the call to errfinish,
https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

3. It's pretty hard to see how you'd reach any of these places for an
ALTER COLUMN TYPE on a simple table. Has the table got rules, triggers,
default values? Could we see "\d+" output for it?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Justin Pryzby
pryzby@telsasoft.com
In reply to: Tom Lane (#7)

On Tue, Jan 03, 2017 at 02:32:36PM -0500, Tom Lane wrote:

Justin Pryzby <pryzby@telsasoft.com> writes:

I can cause the error at will on the existing table,

That's good news, at least.

1. Please trigger it with "\set VERBOSITY verbose" enabled, so we can see
the exact source location --- there are a couple of instances of that
text.

ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;
BEGIN
DROP VIEW
ERROR: 42804: attribute 424 has wrong type
DETAIL: Table has type smallint, but query expects integer.
LOCATION: ExecEvalScalarVar, execQual.c:660

2. Even better would be a stack trace for the call to errfinish,
https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

#1 0x00000000006dd39f in exec_simple_query (query_string=0x1fc5fb0 "begin;")
at postgres.c:932
dest = DestRemote
oldcontext = 0x1f3b100
parsetree_list = 0x1fc69f0
save_log_statement_stats = 0 '\000'
was_logged = 0 '\000'
msec_str = "\360:\235\213\377\177\000\000`<\235\213\377\177\000\000\260_\374\001", '\000' <repeats 11 times>
__func__ = "exec_simple_query"

and then

#1 0x00000000006dd39f in exec_simple_query (
query_string=0x1fc5fb0 "ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;")
at postgres.c:932
dest = DestRemote
oldcontext = 0x1f3b100
parsetree_list = 0x1fc6fc8
save_log_statement_stats = 0 '\000'
was_logged = 0 '\000'
msec_str = "\360:\235\213\377\177\000\000`<\235\213\377\177\000\000\260_\374\001", '\000' <repeats 11 times>
__func__ = "exec_simple_query"

then

#1 0x00000000005d0e30 in ExecEvalScalarVar (exprstate=<value optimized out>,
econtext=<value optimized out>, isNull=<value optimized out>,
isDone=<value optimized out>) at execQual.c:655
attnum = 424
__func__ = "ExecEvalScalarVar"

3. It's pretty hard to see how you'd reach any of these places for an
ALTER COLUMN TYPE on a simple table. Has the table got rules, triggers,
default values? Could we see "\d+" output for it?

triggers and defaults, yes.

sect_id | integer | not null
start_time | timestamp with time zone | not null
site_id | integer | not null
interval_seconds | smallint | not null
utrancell | text | not null
nedn | text | not null
rnc_id | integer | not null
device_id | integer | not null
pmcelldowntimeauto | smallint |
pmcelldowntimeman | smallint |
pmchswitchattemptfachura | smallint |
pmchswitchattempturafach | smallint |
...
Triggers:
eric_umts_rnc_utrancell_insert_trigger BEFORE INSERT ON eric_umts_rnc_utrancell_metrics FOR EACH ROW EXECUTE PROCEDURE eric_umts_rnc_utrancell_insert_function()
Number of child tables: 3 (Use \d+ to list them.)

I'll send the rest of \d if you really want but:

ts=# SELECT COUNT(1) FROM pg_attribute WHERE attrelid='eric_umts_rnc_utrancell_metrics'::regclass;
count | 1116

Justin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Justin Pryzby
pryzby@telsasoft.com
In reply to: Tom Lane (#7)

On Tue, Jan 03, 2017 at 02:32:36PM -0500, Tom Lane wrote:

3. It's pretty hard to see how you'd reach any of these places for an
ALTER COLUMN TYPE on a simple table. Has the table got rules, triggers,
default values? Could we see "\d+" output for it?

I really meant to do \d+..

Table "public.eric_umts_rnc_utrancell_metrics"
Column | Type | Modifiers | Storage | Stats target | Description
---------------------------------------------+--------------------------+-----------+----------+--------------+-------------
sect_id | integer | not null | plain | 400 |
start_time | timestamp with time zone | not null | plain | 400 |
site_id | integer | not null | plain | 400 |
interval_seconds | smallint | not null | plain | 200 |
utrancell | text | not null | extended | 200 |
nedn | text | not null | extended | 200 |
rnc_id | integer | not null | plain | 400 |
device_id | integer | not null | plain | 200 |
pmcelldowntimeauto | smallint | | plain | 10 |
pmcelldowntimeman | smallint | | plain | 10 |
[...]

Justin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Justin Pryzby (#8)

Justin Pryzby <pryzby@telsasoft.com> writes:

On Tue, Jan 03, 2017 at 02:32:36PM -0500, Tom Lane wrote:

2. Even better would be a stack trace for the call to errfinish,
https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

Thanks, but we need the whole call stack, or at least the first dozen or
so levels. "bt" in gdb would do.

I'll send the rest of \d if you really want but:

ts=# SELECT COUNT(1) FROM pg_attribute WHERE attrelid='eric_umts_rnc_utrancell_metrics'::regclass;
count | 1116

Well, we don't know what we're looking for, so assuming that there's
nothing of interest there is probably bad.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Justin Pryzby
pryzby@telsasoft.com
In reply to: Tom Lane (#10)

On Tue, Jan 03, 2017 at 02:50:21PM -0500, Tom Lane wrote:

Justin Pryzby <pryzby@telsasoft.com> writes:

On Tue, Jan 03, 2017 at 02:32:36PM -0500, Tom Lane wrote:

2. Even better would be a stack trace for the call to errfinish,
https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

Thanks, but we need the whole call stack, or at least the first dozen or
so levels. "bt" in gdb would do.

#0 errfinish (dummy=0) at elog.c:414
#1 0x00000000006dd39f in exec_simple_query (query_string=0x1fc5fb0 "ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;") at postgres.c:932
#2 0x00000000006dec8c in PostgresMain (argc=<value optimized out>, argv=<value optimized out>, dbname=0x1f65d98 "ts", username=<value optimized out>) at postgres.c:4070
#3 0x000000000067f2c5 in BackendRun (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:4270
#4 BackendStartup (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:3944
#5 ServerLoop (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1701
#6 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1309
#7 0x0000000000607658 in main (argc=3, argv=0x1f3a4f0) at main.c:228

(gdb) bt
#0 errfinish (dummy=0) at elog.c:414
#1 0x00000000005d0e30 in ExecEvalScalarVar (exprstate=<value optimized out>, econtext=<value optimized out>, isNull=<value optimized out>, isDone=<value optimized out>) at execQual.c:655
#2 0x00000000005d0c3c in ExecMakeFunctionResultNoSets (fcache=0x21f18a0, econtext=0x2199e80, isNull=0x21e90ee "", isDone=<value optimized out>) at execQual.c:2015
#3 0x000000000059d5ce in ATRewriteTable (tab=<value optimized out>, OIDNewHeap=<value optimized out>, lockmode=<value optimized out>) at tablecmds.c:4152
#4 0x00000000005a92fc in ATRewriteTables (parsetree=0x1f63b20, rel=<value optimized out>, cmds=<value optimized out>, recurse=<value optimized out>, lockmode=<value optimized out>) at tablecmds.c:3858
#5 ATController (parsetree=0x1f63b20, rel=<value optimized out>, cmds=<value optimized out>, recurse=<value optimized out>, lockmode=<value optimized out>) at tablecmds.c:3104
#6 0x00000000006e25e6 in ProcessUtilitySlow (parsetree=0x1fc6f78, queryString=0x1fc5fb0 "ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=<value optimized out>, completionTag=0x7fff8b9d3a90 "") at utility.c:1085
#7 0x00000000006e2a70 in standard_ProcessUtility (parsetree=0x1fc6f78,
queryString=0x1fc5fb0 "ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x1fc72b8,
completionTag=0x7fff8b9d3a90 "") at utility.c:907
#8 0x00000000006df2cc in PortalRunUtility (portal=0x1fff2e0, utilityStmt=0x1fc6f78, isTopLevel=1 '\001', setHoldSnapshot=<value optimized out>, dest=0x1fc72b8, completionTag=0x7fff8b9d3a90 "") at pquery.c:1193
#9 0x00000000006e01cb in PortalRunMulti (portal=0x1fff2e0, isTopLevel=1 '\001', setHoldSnapshot=0 '\000', dest=0x1fc72b8, altdest=0x1fc72b8, completionTag=0x7fff8b9d3a90 "") at pquery.c:1349
#10 0x00000000006e0934 in PortalRun (portal=0x1fff2e0, count=9223372036854775807, isTopLevel=1 '\001', dest=0x1fc72b8, altdest=0x1fc72b8, completionTag=0x7fff8b9d3a90 "") at pquery.c:815
#11 0x00000000006dd5b1 in exec_simple_query (query_string=0x1fc5fb0 "ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;") at postgres.c:1094
#12 0x00000000006dec8c in PostgresMain (argc=<value optimized out>, argv=<value optimized out>, dbname=0x1f65d98 "ts", username=<value optimized out>) at postgres.c:4070
#13 0x000000000067f2c5 in BackendRun (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:4270
#14 BackendStartup (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:3944
#15 ServerLoop (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1701
#16 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1309
#17 0x0000000000607658 in main (argc=3, argv=0x1f3a4f0) at main.c:228

I'll send the rest of \d if you really want but:

Well, we don't know what we're looking for, so assuming that there's
nothing of interest there is probably bad.

Attached

Justin

Attachments:

alter-wrong-type-dplus.gzapplication/octet-streamDownload
#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Justin Pryzby (#11)

Justin Pryzby <pryzby@telsasoft.com> writes:

(gdb) bt
#0 errfinish (dummy=0) at elog.c:414
#1 0x00000000005d0e30 in ExecEvalScalarVar (exprstate=<value optimized out>, econtext=<value optimized out>, isNull=<value optimized out>, isDone=<value optimized out>) at execQual.c:655
#2 0x00000000005d0c3c in ExecMakeFunctionResultNoSets (fcache=0x21f18a0, econtext=0x2199e80, isNull=0x21e90ee "", isDone=<value optimized out>) at execQual.c:2015
#3 0x000000000059d5ce in ATRewriteTable (tab=<value optimized out>, OIDNewHeap=<value optimized out>, lockmode=<value optimized out>) at tablecmds.c:4152
#4 0x00000000005a92fc in ATRewriteTables (parsetree=0x1f63b20, rel=<value optimized out>, cmds=<value optimized out>, recurse=<value optimized out>, lockmode=<value optimized out>) at tablecmds.c:3858
#5 ATController (parsetree=0x1f63b20, rel=<value optimized out>, cmds=<value optimized out>, recurse=<value optimized out>, lockmode=<value optimized out>) at tablecmds.c:3104
#6 0x00000000006e25e6 in ProcessUtilitySlow (parsetree=0x1fc6f78, queryString=0x1fc5fb0 "ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=<value optimized out>, completionTag=0x7fff8b9d3a90 "") at utility.c:1085

Okay, so it's clearly processing the USING expression and not something
else, which is weird because that should've just been parsed against the
existing table column; how could that Var contain the wrong type?

I'm wondering if this represents some sort of out-of-sync condition
between the table and its child tables. We can't actually tell from
this trace which table is being processed. Could you try, from this
breakpoint,

f 3
p oldrel->rd_rel->relname

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#12)

I wrote:

I'm wondering if this represents some sort of out-of-sync condition
between the table and its child tables.

Hah:

regression=# create table p(f1 int);
CREATE TABLE
regression=# create table c1(extra smallint) inherits(p);
CREATE TABLE
regression=# alter table p add column f2 int;
ALTER TABLE
regression=# insert into c1 values(1,2,3);
INSERT 0 1
regression=# alter table p alter column f2 type bigint using f2::bigint;
ERROR: attribute 2 has wrong type
DETAIL: Table has type smallint, but query expects integer.

Of course, in c1 the target column is #3 not #2. The USING expression
isn't being adjusted for the discrepancy between parent and child column
numbers.

This test case works before 9.5; somebody must have broke it while
refactoring.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Justin Pryzby
pryzby@telsasoft.com
In reply to: Tom Lane (#12)

On Tue, Jan 03, 2017 at 03:18:15PM -0500, Tom Lane wrote:

Justin Pryzby <pryzby@telsasoft.com> writes:

(gdb) bt
#3 0x000000000059d5ce in ATRewriteTable (tab=<value optimized out>, OIDNewHeap=<value optimized out>, lockmode=<value optimized out>) at tablecmds.c:4152

I'm wondering if this represents some sort of out-of-sync condition
between the table and its child tables. We can't actually tell from
this trace which table is being processed. Could you try, from this
breakpoint,

f 3
p oldrel->rd_rel->relname

(gdb) p oldrel->rd_rel->relname
$1 = {data = "eric_umts_rnc_utrancell_201701", '\000' <repeats 33 times>}

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Justin Pryzby (#14)

Justin Pryzby <pryzby@telsasoft.com> writes:

On Tue, Jan 03, 2017 at 03:18:15PM -0500, Tom Lane wrote:

I'm wondering if this represents some sort of out-of-sync condition
between the table and its child tables. We can't actually tell from
this trace which table is being processed. Could you try, from this
breakpoint,

f 3
p oldrel->rd_rel->relname

(gdb) p oldrel->rd_rel->relname
$1 = {data = "eric_umts_rnc_utrancell_201701", '\000' <repeats 33 times>}

Right. So I bet that if you check the attnum of pmsumpacketlatency_000 in
eric_umts_rnc_utrancell_metrics, you'll find it's different from that in
eric_umts_rnc_utrancell_201701, and that the attribute having that attnum
in eric_umts_rnc_utrancell_201701 has type smallint not int.

This is an expected situation in some situations where you ALTER existing
inheritance hierarchies; it's a bug that ALTER COLUMN is failing to cope.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Justin Pryzby
pryzby@telsasoft.com
In reply to: Tom Lane (#15)

On Tue, Jan 03, 2017 at 03:35:34PM -0500, Tom Lane wrote:

Justin Pryzby <pryzby@telsasoft.com> writes:

On Tue, Jan 03, 2017 at 03:18:15PM -0500, Tom Lane wrote:

I'm wondering if this represents some sort of out-of-sync condition
between the table and its child tables. We can't actually tell from
this trace which table is being processed. Could you try, from this
breakpoint,

f 3
p oldrel->rd_rel->relname

(gdb) p oldrel->rd_rel->relname
$1 = {data = "eric_umts_rnc_utrancell_201701", '\000' <repeats 33 times>}

Right. So I bet that if you check the attnum of pmsumpacketlatency_000 in
eric_umts_rnc_utrancell_metrics, you'll find it's different from that in
eric_umts_rnc_utrancell_201701, and that the attribute having that attnum
in eric_umts_rnc_utrancell_201701 has type smallint not int.

I think that's consistent with what your understanding:

ts=# SELECT attrelid::regclass, attname, attnum, atttypid FROM pg_attribute WHERE attrelid::regclass::text~'eric_umts_rnc_utrancell_(metrics|201701)$' AND (attname='pmsumpacketlatency_000' OR attnum IN (367,424) ) ORDER BY 1,2;
eric_umts_rnc_utrancell_metrics | pmsamplespshsadchrabestablish | 367 | 21
eric_umts_rnc_utrancell_metrics | pmsumpacketlatency_000 | 424 | 23
eric_umts_rnc_utrancell_201701 | pmsumpacketlatency_000 | 367 | 23
eric_umts_rnc_utrancell_201701 | pmulupswitchsuccessmedium | 424 | 21

Justin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Justin Pryzby (#16)

Justin Pryzby <pryzby@telsasoft.com> writes:

On Tue, Jan 03, 2017 at 03:35:34PM -0500, Tom Lane wrote:

Right. So I bet that if you check the attnum of pmsumpacketlatency_000 in
eric_umts_rnc_utrancell_metrics, you'll find it's different from that in
eric_umts_rnc_utrancell_201701, and that the attribute having that attnum
in eric_umts_rnc_utrancell_201701 has type smallint not int.

I think that's consistent with what your understanding:

ts=# SELECT attrelid::regclass, attname, attnum, atttypid FROM pg_attribute WHERE attrelid::regclass::text~'eric_umts_rnc_utrancell_(metrics|201701)$' AND (attname='pmsumpacketlatency_000' OR attnum IN (367,424) ) ORDER BY 1,2;
eric_umts_rnc_utrancell_metrics | pmsamplespshsadchrabestablish | 367 | 21
eric_umts_rnc_utrancell_metrics | pmsumpacketlatency_000 | 424 | 23
eric_umts_rnc_utrancell_201701 | pmsumpacketlatency_000 | 367 | 23
eric_umts_rnc_utrancell_201701 | pmulupswitchsuccessmedium | 424 | 21

Yup. So if you can't wait for a fix, your best bet would be to dump and
reload these tables, which should bring their attnums back in sync.
(Of course, they might not stay that way for long, if you're also
in the habit of adding columns often.)

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#13)

I wrote:

Hah:

regression=# create table p(f1 int);
CREATE TABLE
regression=# create table c1(extra smallint) inherits(p);
CREATE TABLE
regression=# alter table p add column f2 int;
ALTER TABLE
regression=# insert into c1 values(1,2,3);
INSERT 0 1
regression=# alter table p alter column f2 type bigint using f2::bigint;
ERROR: attribute 2 has wrong type
DETAIL: Table has type smallint, but query expects integer.

Of course, in c1 the target column is #3 not #2. The USING expression
isn't being adjusted for the discrepancy between parent and child column
numbers.

This test case works before 9.5; somebody must have broke it while
refactoring.

A little bit of "git bisect"-ing later, the blame is pinned on

commit 9550e8348b7965715789089555bb5a3fda8c269c
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Fri Apr 3 17:33:05 2015 -0300

Transform ALTER TABLE/SET TYPE/USING expr during parse analysis

This lets later stages have access to the transformed expression; in
particular it allows DDL-deparsing code during event triggers to pass
the transformed expression to ruleutils.c, so that the complete command
can be deparsed.

This shuffles the timing of the transform calls a bit: previously,
nothing was transformed during parse analysis, and only the
RELKIND_RELATION case was being handled during execution. After this
patch, all expressions are transformed during parse analysis (including
those for relkinds other than RELATION), and the error for other
relation kinds is thrown only during execution. So we do more work than
before to reject some bogus cases. That seems acceptable.

Of course, the reason why this work was postponed until execution was
exactly because we wanted to do it over again for each child table.

We could probably fix the specific issue being seen here by passing the
expression tree through a suitable attno remapping, but I am now filled
with dread about how much of the event trigger code may be naively
supposing that child tables have the same attnums as their parents.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#18)

Tom Lane wrote:

A little bit of "git bisect"-ing later, the blame is pinned on

commit 9550e8348b7965715789089555bb5a3fda8c269c
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Fri Apr 3 17:33:05 2015 -0300

Transform ALTER TABLE/SET TYPE/USING expr during parse analysis

This lets later stages have access to the transformed expression; in
particular it allows DDL-deparsing code during event triggers to pass
the transformed expression to ruleutils.c, so that the complete command
can be deparsed.

This shuffles the timing of the transform calls a bit: previously,
nothing was transformed during parse analysis, and only the
RELKIND_RELATION case was being handled during execution. After this
patch, all expressions are transformed during parse analysis (including
those for relkinds other than RELATION), and the error for other
relation kinds is thrown only during execution. So we do more work than
before to reject some bogus cases. That seems acceptable.

Of course, the reason why this work was postponed until execution was
exactly because we wanted to do it over again for each child table.

We could probably fix the specific issue being seen here by passing the
expression tree through a suitable attno remapping,

Hmm, ouch. I can look into fixing this starting tomorrow afternoon.

but I am now filled with dread about how much of the event trigger
code may be naively supposing that child tables have the same attnums
as their parents.

I guess it's on me to figure that out.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#18)

Tom Lane wrote:

We could probably fix the specific issue being seen here by passing the
expression tree through a suitable attno remapping,

Here's a first attempt at fixing this. It makes the test pass, but I
have the feeling that more complex ones might need more work. Have to
leave for a bit now.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

fix-altertype.patchtext/plain; charset=us-asciiDownload+88-4
#21Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#20)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#21)
#23Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#22)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#23)
#25Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#24)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#25)
#27Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#26)
#28Justin Pryzby
pryzby@telsasoft.com
In reply to: Justin Pryzby (#1)
#29David G. Johnston
david.g.johnston@gmail.com
In reply to: Justin Pryzby (#28)
#30Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Justin Pryzby (#28)