Assignment of valid collation for SET operations on queries with UNKNOWN types.

rahilasyed90@gmail.com

over 9 years ago

Following UNION of two queries with constant literals runs successfully.

CASE 1:
postgres=# SELECT 'abc' UNION SELECT 'bcd' ;
?column?
----------
abc
bcd
(2 rows)

whereas when these literals are part of a view, the UNION fails.

CASE 2:
postgres=# create view v as select 'abc' a;
2016-11-16 15:28:48 IST WARNING: column "a" has type "unknown"
2016-11-16 15:28:48 IST DETAIL: Proceeding with relation creation anyway.
WARNING: column "a" has type "unknown"
DETAIL: Proceeding with relation creation anyway.
CREATE VIEW

postgres=# create view v1 as select 'bcd' a;
2016-11-16 15:28:56 IST WARNING: column "a" has type "unknown"
2016-11-16 15:28:56 IST DETAIL: Proceeding with relation creation anyway.
WARNING: column "a" has type "unknown"
DETAIL: Proceeding with relation creation anyway.
CREATE VIEW

postgres=# select a from v UNION select a from v1;
2016-11-16 15:25:28 IST ERROR: could not determine which collation to use
for string comparison
2016-11-16 15:25:28 IST HINT: Use the COLLATE clause to set the collation
explicitly.
2016-11-16 15:25:28 IST STATEMENT: select a from v UNION select a from v1;
ERROR: could not determine which collation to use for string comparison
HINT: Use the COLLATE clause to set the collation explicitly.

When UNION of queries with constant literals as in CASE 1 is allowed
shouldn't a UNION of queries with literals in a view as in CASE 2 be
allowed?

In transformSetOperationTree, while determining the result type of the
merged
output columns, if the left and right column types are UNKNOWNs the result
type
is resolved to TEXT.

The difference of behaviour in above two cases arises because the result
collation
assigned is not valid in CASE 2.

When the left and the right inputs are literal constants i.e UNKNOWN as in
Case 1
the collation of result column is correctly assigned to a valid value.

Whereas when the left and the right inputs are columns of UNKNOWN type as
in Case 2,
the result collation is InvalidOid.

So if we ensure assignment of a valid collation when the left and the right
columns/inputs
are UNKNOWN, the above can be resolved.

Attached WIP patch does that. Kindly let me know your opinion.

Thank you,
Rahila Syed

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Rahila Syed (#1)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

Rahila Syed <rahilasyed90@gmail.com> writes:

CASE 2:
postgres=# create view v as select 'abc' a;
2016-11-16 15:28:48 IST WARNING: column "a" has type "unknown"
2016-11-16 15:28:48 IST DETAIL: Proceeding with relation creation anyway.
WARNING: column "a" has type "unknown"
DETAIL: Proceeding with relation creation anyway.
CREATE VIEW

We really ought to make that a hard error. And ideally fix things so
that the type of the view column will be resolved as text, so that you
don't hit this condition in the first place; but there is no good that
comes out of allowing a view to be created like this.

Attached WIP patch does that. Kindly let me know your opinion.

This is a seriously ugly kluge that's attacking the symptom not the
problem. Or really, a symptom not the problem. There are lots of
other symptoms, for instance

regression=# select * from v order by 1;
ERROR: failed to find conversion function from unknown to text

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

rahilasyed90@gmail.com

over 9 years ago

In reply to: Tom Lane (#2)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

Hello,

And ideally fix things so
that the type of the view column will be resolved as text, so that you
don't hit this condition in the first place; but there is no good that
comes out of allowing a view to be created like this

Thank you for suggestion. Attached is a patch which resolves the columns
with literal constants as TEXT for view creation.

Following views can be created with literal columns resolved as TEXT.

This allows following queries to run successfully which wasn't the case
earlier.

postgres=# select a from v UNION select a from v1;
a
-----
abc
def
(2 rows)

AND

postgres=# select * from v order by 1;
a
-----
abc
(1 row)

Kindly give your opinion.

Thank you,
Rahila Syed.

On Thu, Nov 17, 2016 at 8:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

Rahila Syed <rahilasyed90@gmail.com> writes:

CASE 2:
postgres=# create view v as select 'abc' a;
2016-11-16 15:28:48 IST WARNING: column "a" has type "unknown"
2016-11-16 15:28:48 IST DETAIL: Proceeding with relation creation

anyway.

WARNING: column "a" has type "unknown"
DETAIL: Proceeding with relation creation anyway.
CREATE VIEW

We really ought to make that a hard error. And ideally fix things so
that the type of the view column will be resolved as text, so that you
don't hit this condition in the first place; but there is no good that
comes out of allowing a view to be created like this.

Attached WIP patch does that. Kindly let me know your opinion.

This is a seriously ugly kluge that's attacking the symptom not the
problem. Or really, a symptom not the problem. There are lots of
other symptoms, for instance

regression=# select * from v order by 1;
ERROR: failed to find conversion function from unknown to text

regards, tom lane

michael@paquier.xyz

over 9 years ago

In reply to: Rahila Syed (#3)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

On Tue, Dec 6, 2016 at 10:42 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:

Hello,

And ideally fix things so
that the type of the view column will be resolved as text, so that you
don't hit this condition in the first place; but there is no good that
comes out of allowing a view to be created like this

Thank you for suggestion. Attached is a patch which resolves the columns
with literal constants as TEXT for view creation.

Following views can be created with literal columns resolved as TEXT.

postgres=# create view v as select 'abc' a;
CREATE VIEW
postgres=# create view v1 as select 'def' a;
CREATE VIEW

There is a similar code pattern for materialized views, see
create_ctas_nodata() where the attribute list is built. Even with your
patch, I get that:
=# create materialized view m as select 'abc' a;
WARNING: 42P16: column "a" has type "unknown"
DETAIL: Proceeding with relation creation anyway.
LOCATION: CheckAttributeType, heap.c:499
SELECT 1
Time: 6.566 ms
=# select * from m order by 1;
ERROR: XX000: failed to find conversion function from unknown to text

Your patch has no regression tests, surely you want some to stress
this code path. And actually, shouldn't this be just a plain error?
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

michael@paquier.xyz

over 9 years ago

In reply to: Michael Paquier (#4)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

On Wed, Dec 7, 2016 at 4:24 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Tue, Dec 6, 2016 at 10:42 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:

Thank you for suggestion. Attached is a patch which resolves the columns
with literal constants as TEXT for view creation.

You may also want to add your patch to a CF so as it does not fall
into the cracks.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

rahilasyed90@gmail.com

over 9 years ago

In reply to: Michael Paquier (#4)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

Hello,

Thank you for comments.

There is a similar code pattern for materialized views, see
create_ctas_nodata() where the attribute list is built

create_ctas_nodata() is for creation of materialized views WITH NO DATA.
For other materialized views and CREATE TABLE AS, column definitions are
built in
intorel_startup() function which has different code from that of CREATE
VIEW which
the patch deals with.

Limiting the scope of the patch to include changing the type of literal
constants
to text only for plain views. Also, error out when column with UNKNOWN type
is
being created for other relations like tables and materialized views.

And actually, shouldn't this be just a plain error?

Changed it to error in the attached patch.

Your patch has no regression tests, surely you want some to stress
this code path

Added regression tests in the attached patch.

Also adding this patch to CF 2017-01

Thank you,
Rahila Syed

michael@paquier.xyz

over 9 years ago

In reply to: Rahila Syed (#6)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

On Wed, Dec 14, 2016 at 7:02 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:

There is a similar code pattern for materialized views, see
create_ctas_nodata() where the attribute list is built

create_ctas_nodata() is for creation of materialized views WITH NO DATA.
For other materialized views and CREATE TABLE AS, column definitions are
built in
intorel_startup() function which has different code from that of CREATE VIEW
which
the patch deals with.

Limiting the scope of the patch to include changing the type of literal
constants
to text only for plain views. Also, error out when column with UNKNOWN type
is
being created for other relations like tables and materialized views.

Matviews is the same way of thinking as views in terms of definition.
It is inconsistent to try to address a problem only partially if the
same behavior shows up in different code paths.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

ashutosh.bapat@enterprisedb.com

over 9 years ago

In reply to: Rahila Syed (#6)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

The way this patch has been written, it doesn't allow creating tables
with unknown type columns, which was allowed earlier. That breaks
backward compatibility. Users, who have created such tables will face
problems while loading dumps from earlier versions. pg_upgrade might
be an issue, but we may modify it to convert those columns to text.
Given that a column with unknown type is pretty much useless unless
casted to some other type, there may not be many such users out there.
But we should probably add a notice in release notes.

+-- check coercion of UNKNOWN type to text for literal constants
+create view v as select 'abc' a;
+create view v11 as select 'def' a;
+select a from v UNION select a from v11;
+

The comment says that it's testing coercion of unknown to text, but
nothing in the test guarantees that constant literals were casted to
text. The union could give the expected result if code merging the two
views knew how to handle unknown types. Instead \d v or \d v11 would
be a better test to test what the comment says. If we want to test
such a union the test is fine, but the comment needs to change.

For a materialized view you may have to modify
transformCreateTableAsStmt() to modify at targetlist of the query
similar to DefineVirtualRelation(), but I think that can be done as a
separate patch.

You might want to add some testcases to test the error report e.g.
(not necessarily in the same form) create view sv as select
relname::unknown from pg_class;
ERROR: column "relname" has type "unknown"

On Wed, Dec 14, 2016 at 3:32 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:

Hello,

Thank you for comments.

There is a similar code pattern for materialized views, see
create_ctas_nodata() where the attribute list is built

create_ctas_nodata() is for creation of materialized views WITH NO DATA.
For other materialized views and CREATE TABLE AS, column definitions are
built in
intorel_startup() function which has different code from that of CREATE VIEW
which
the patch deals with.

Limiting the scope of the patch to include changing the type of literal
constants
to text only for plain views. Also, error out when column with UNKNOWN type
is
being created for other relations like tables and materialized views.

And actually, shouldn't this be just a plain error?

Changed it to error in the attached patch.

Your patch has no regression tests, surely you want some to stress
this code path

Added regression tests in the attached patch.

Also adding this patch to CF 2017-01

Thank you,
Rahila Syed

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Ashutosh Bapat (#8)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:

The way this patch has been written, it doesn't allow creating tables
with unknown type columns, which was allowed earlier.

Yes, that's an intentional change; creating such tables (or views) has
never been anything but a foot-gun.

However, I thought the idea was to silently coerce affected columns from
unknown to text. This doesn't look like the behavior we want:

You might want to add some testcases to test the error report e.g.
(not necessarily in the same form) create view sv as select
relname::unknown from pg_class;
ERROR: column "relname" has type "unknown"

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

ashutosh.bapat@enterprisedb.com

over 9 years ago

In reply to: Tom Lane (#9)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

On Thu, Dec 29, 2016 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:

The way this patch has been written, it doesn't allow creating tables
with unknown type columns, which was allowed earlier.

Yes, that's an intentional change; creating such tables (or views) has
never been anything but a foot-gun.

However, I thought the idea was to silently coerce affected columns from
unknown to text. This doesn't look like the behavior we want:

Do you mean to say that when creating a table with a column of unknown
type, that column type should be silently converted (there's nothing
to coerce when the table is being created) to text? instead of
throwing an error?

The patch does that when creating a view with constant literals, which
are known to be binary compatible with text and hence coercible. It
looks like any "unknown' type data should be coercible to text, so it
shouldn't matter whether those are constants or non-constant nodes.

You might want to add some testcases to test the error report e.g.
(not necessarily in the same form) create view sv as select
relname::unknown from pg_class;
ERROR: column "relname" has type "unknown"

regards, tom lane

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

michael@paquier.xyz

over 9 years ago

In reply to: Ashutosh Bapat (#10)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

On Fri, Dec 30, 2016 at 1:30 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

On Thu, Dec 29, 2016 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:

The way this patch has been written, it doesn't allow creating tables
with unknown type columns, which was allowed earlier.

Yes, that's an intentional change; creating such tables (or views) has
never been anything but a foot-gun.

However, I thought the idea was to silently coerce affected columns from
unknown to text. This doesn't look like the behavior we want:

Do you mean to say that when creating a table with a column of unknown
type, that column type should be silently converted (there's nothing
to coerce when the table is being created) to text? instead of
throwing an error?

FWIW that's what I understood: the patch should switch unknown columns
to text. A bunch of side effects when converting types are avoided
this way.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

rahilasyed90@gmail.com

over 9 years ago

In reply to: Michael Paquier (#11)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

Thank you all for inputs.
Kindly help me clarify the scope of the patch.

However, I thought the idea was to silently coerce affected columns from
unknown to text. This doesn't look like the behavior we want:

This patch prevents creation of relation with unknown columns and
in addition fixes the particular case of CREATE VIEW with literal columns
by coercing unknown to text only in this particular case.

Are you suggesting extending the patch to include coercing from unknown to
text for all possible cases where a column of unknown type is being created?

Thank you,
Rahila Syed

On Fri, Dec 30, 2016 at 7:21 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:

Show quoted text

On Fri, Dec 30, 2016 at 1:30 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

On Thu, Dec 29, 2016 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:

The way this patch has been written, it doesn't allow creating tables
with unknown type columns, which was allowed earlier.

Yes, that's an intentional change; creating such tables (or views) has
never been anything but a foot-gun.

However, I thought the idea was to silently coerce affected columns from
unknown to text. This doesn't look like the behavior we want:

Do you mean to say that when creating a table with a column of unknown
type, that column type should be silently converted (there's nothing
to coerce when the table is being created) to text? instead of
throwing an error?

FWIW that's what I understood: the patch should switch unknown columns
to text. A bunch of side effects when converting types are avoided
this way.
--
Michael

#13

ashutosh.bapat@enterprisedb.com

over 9 years ago

In reply to: Rahila Syed (#12)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

On Tue, Jan 3, 2017 at 5:57 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:

Thank you all for inputs.
Kindly help me clarify the scope of the patch.

However, I thought the idea was to silently coerce affected columns from
unknown to text. This doesn't look like the behavior we want:

This patch prevents creation of relation with unknown columns and
in addition fixes the particular case of CREATE VIEW with literal columns
by coercing unknown to text only in this particular case.

Are you suggesting extending the patch to include coercing from unknown to
text for all possible cases where a column of unknown type is being created?

I guess, that's what Tom is suggesting.

We need to do something to avoid throwing an error in the cases which
do not throw error in earlier releases.

We may just leave that warning as warning for now, and just tackle the
view case in this patch. I would like everything being done in one
patch, rather than this piece-meal approach. But delaying fix for
views because it takes longer to work on other pieces may not be good
either.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Ashutosh Bapat (#13)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:

On Tue, Jan 3, 2017 at 5:57 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:

Are you suggesting extending the patch to include coercing from unknown to
text for all possible cases where a column of unknown type is being created?

I guess, that's what Tom is suggesting.

Yes; I think the point is we should change the semantics we assume for
"SELECT 'foo'", not only for views but across the board. There are plenty
of non-view-related cases where it doesn't behave well, for example

regression=# select * from
(select 'foo' from generate_series(1,3)) ss order by 1;
ERROR: failed to find conversion function from unknown to text

Furthermore, it seems like a seriously bad idea for "SELECT 'foo'" to mean
something different in the context of CREATE VIEW than it means elsewhere.

The trick is to not break cases where we've already hacked things to allow
external influence on the resolved types of SELECT-list items. AFAICT,
these cases are just INSERT/SELECT and set operations (eg unions):

regression=# create table target (f1 int);
CREATE TABLE
regression=# explain verbose insert into target select '42' from generate_series(1,3);
QUERY PLAN

--------------------------------------------------------------------------------
---------
Insert on public.target (cost=0.00..10.00 rows=1000 width=4)
-> Function Scan on pg_catalog.generate_series (cost=0.00..10.00 rows=1000
width=4)
Output: 42
Function Call: generate_series(1, 3)
(4 rows)

regression=# explain verbose select '42' union all select 43;
QUERY PLAN
------------------------------------------------
Append (cost=0.00..0.04 rows=2 width=4)
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: 42
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: 43
(5 rows)

In both of the above cases, we're getting an integer constant not a
text or "unknown" constant, because the type information gets imported
from outside the "SELECT".

I spent some time fooling with this today and came up with the attached
patch. I think this is basically the direction we should go in, but
there are various loose ends:

1. I adjusted a couple of existing regression tests whose results are
affected, but didn't do anything towards adding new tests showing the
desirable results of this change (as per the examples in this thread).

2. I didn't do anything about docs, either, though maybe no change
is needed. One user-visible change from this is that queries should
never return any "unknown"-type columns to clients anymore. But I
think that is not documented now, so maybe there's nothing to change.

3. We need to look at whether pg_upgrade is affected. I think we
might be able to let it just ignore the issue for views, since they'd
get properly updated during the dump and reload anyway. But if someone
had an actual table (or matview) with an "unknown" column, that would
be a pg_upgrade hazard, because "unknown" doesn't store like "text".

4. If "unknown" were marked as a pseudotype not base type in pg_type
(ie, typtype "p" not "b") then the special case for it in
CheckAttributeType could go away completely. I'm not really sure
why it's "b" today --- maybe specifically because of this point that
it's currently possible to create tables with unknown columns? But
I've not looked at what all the implications of changing that might
be, and anyway it could be left for a followon patch.

5. I noticed that the "resolveUnknown" arguments of transformSortClause
and other functions in parse_clause.c could go away: there's really no
reason not to just treat them as "true" always. But that could be
separate cleanup as well.

Anybody want to hack on those loose ends?

regards, tom lane

#15

michael@paquier.xyz

over 9 years ago

In reply to: Tom Lane (#14)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

On Sun, Jan 8, 2017 at 10:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:

On Tue, Jan 3, 2017 at 5:57 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:

Are you suggesting extending the patch to include coercing from unknown to
text for all possible cases where a column of unknown type is being created?

I guess, that's what Tom is suggesting.

Yes; I think the point is we should change the semantics we assume for
"SELECT 'foo'", not only for views but across the board. There are plenty
of non-view-related cases where it doesn't behave well, for example

regression=# select * from
(select 'foo' from generate_series(1,3)) ss order by 1;
ERROR: failed to find conversion function from unknown to text

Furthermore, it seems like a seriously bad idea for "SELECT 'foo'" to mean
something different in the context of CREATE VIEW than it means elsewhere.

And this offers the same semantics for any DDL using SELECT's target
list to build the list of column's types, which is consistent.

The trick is to not break cases where we've already hacked things to allow
external influence on the resolved types of SELECT-list items. AFAICT,
these cases are just INSERT/SELECT and set operations (eg unions):

regression=# create table target (f1 int);
CREATE TABLE
regression=# explain verbose insert into target select '42' from generate_series(1,3);
QUERY PLAN

--------------------------------------------------------------------------------
---------
Insert on public.target (cost=0.00..10.00 rows=1000 width=4)
-> Function Scan on pg_catalog.generate_series (cost=0.00..10.00 rows=1000
width=4)
Output: 42
Function Call: generate_series(1, 3)
(4 rows)

regression=# explain verbose select '42' union all select 43;
QUERY PLAN
------------------------------------------------
Append (cost=0.00..0.04 rows=2 width=4)
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: 42
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: 43
(5 rows)

In both of the above cases, we're getting an integer constant not a
text or "unknown" constant, because the type information gets imported
from outside the "SELECT".

I spent some time fooling with this today and came up with the attached
patch. I think this is basically the direction we should go in, but
there are various loose ends:

1. I adjusted a couple of existing regression tests whose results are
affected, but didn't do anything towards adding new tests showing the
desirable results of this change (as per the examples in this thread).

2. I didn't do anything about docs, either, though maybe no change
is needed. One user-visible change from this is that queries should
never return any "unknown"-type columns to clients anymore. But I
think that is not documented now, so maybe there's nothing to change.

3. We need to look at whether pg_upgrade is affected. I think we
might be able to let it just ignore the issue for views, since they'd
get properly updated during the dump and reload anyway. But if someone
had an actual table (or matview) with an "unknown" column, that would
be a pg_upgrade hazard, because "unknown" doesn't store like "text".

4. If "unknown" were marked as a pseudotype not base type in pg_type
(ie, typtype "p" not "b") then the special case for it in
CheckAttributeType could go away completely. I'm not really sure
why it's "b" today --- maybe specifically because of this point that
it's currently possible to create tables with unknown columns? But
I've not looked at what all the implications of changing that might
be, and anyway it could be left for a followon patch.

5. I noticed that the "resolveUnknown" arguments of transformSortClause
and other functions in parse_clause.c could go away: there's really no
reason not to just treat them as "true" always. But that could be
separate cleanup as well.

Anybody want to hack on those loose ends?

Ashutosh, Rahila, do you have plans to review things here?
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16

ashutosh.bapat@enterprisedb.com

over 9 years ago

In reply to: Michael Paquier (#15)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

On Wed, Jan 18, 2017 at 10:55 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Sun, Jan 8, 2017 at 10:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:

On Tue, Jan 3, 2017 at 5:57 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:

Are you suggesting extending the patch to include coercing from unknown to
text for all possible cases where a column of unknown type is being created?

I guess, that's what Tom is suggesting.

Yes; I think the point is we should change the semantics we assume for
"SELECT 'foo'", not only for views but across the board. There are plenty
of non-view-related cases where it doesn't behave well, for example

regression=# select * from
(select 'foo' from generate_series(1,3)) ss order by 1;
ERROR: failed to find conversion function from unknown to text

Furthermore, it seems like a seriously bad idea for "SELECT 'foo'" to mean
something different in the context of CREATE VIEW than it means elsewhere.

And this offers the same semantics for any DDL using SELECT's target
list to build the list of column's types, which is consistent.

The trick is to not break cases where we've already hacked things to allow
external influence on the resolved types of SELECT-list items. AFAICT,
these cases are just INSERT/SELECT and set operations (eg unions):

regression=# create table target (f1 int);
CREATE TABLE
regression=# explain verbose insert into target select '42' from generate_series(1,3);
QUERY PLAN

--------------------------------------------------------------------------------
---------
Insert on public.target (cost=0.00..10.00 rows=1000 width=4)
-> Function Scan on pg_catalog.generate_series (cost=0.00..10.00 rows=1000
width=4)
Output: 42
Function Call: generate_series(1, 3)
(4 rows)

regression=# explain verbose select '42' union all select 43;
QUERY PLAN
------------------------------------------------
Append (cost=0.00..0.04 rows=2 width=4)
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: 42
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: 43
(5 rows)

In both of the above cases, we're getting an integer constant not a
text or "unknown" constant, because the type information gets imported
from outside the "SELECT".

I spent some time fooling with this today and came up with the attached
patch. I think this is basically the direction we should go in, but
there are various loose ends:

1. I adjusted a couple of existing regression tests whose results are
affected, but didn't do anything towards adding new tests showing the
desirable results of this change (as per the examples in this thread).

2. I didn't do anything about docs, either, though maybe no change
is needed. One user-visible change from this is that queries should
never return any "unknown"-type columns to clients anymore. But I
think that is not documented now, so maybe there's nothing to change.

3. We need to look at whether pg_upgrade is affected. I think we
might be able to let it just ignore the issue for views, since they'd
get properly updated during the dump and reload anyway. But if someone
had an actual table (or matview) with an "unknown" column, that would
be a pg_upgrade hazard, because "unknown" doesn't store like "text".

4. If "unknown" were marked as a pseudotype not base type in pg_type
(ie, typtype "p" not "b") then the special case for it in
CheckAttributeType could go away completely. I'm not really sure
why it's "b" today --- maybe specifically because of this point that
it's currently possible to create tables with unknown columns? But
I've not looked at what all the implications of changing that might
be, and anyway it could be left for a followon patch.

5. I noticed that the "resolveUnknown" arguments of transformSortClause
and other functions in parse_clause.c could go away: there's really no
reason not to just treat them as "true" always. But that could be
separate cleanup as well.

Anybody want to hack on those loose ends?

Ashutosh, Rahila, do you have plans to review things here?

I won't be able to work on creating patches for TODOs listed by Tom.
But I can review if someone volunteers to produce the patches.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Tom Lane (#14)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

I wrote:

I spent some time fooling with this today and came up with the attached
patch. I think this is basically the direction we should go in, but
there are various loose ends:

Here's an updated patch that's rebased against today's HEAD and addresses
most of the loose ends:

1. I adjusted a couple of existing regression tests whose results are
affected, but didn't do anything towards adding new tests showing the
desirable results of this change (as per the examples in this thread).

Added some regression test cases. These are mostly designed to prove
that coercion to text occurs where expected, but I did include a couple
of queries that would have failed outright before.

2. I didn't do anything about docs, either, though maybe no change
is needed. One user-visible change from this is that queries should
never return any "unknown"-type columns to clients anymore. But I
think that is not documented now, so maybe there's nothing to change.

The only thing I could find in the SGML docs that directly addresses
unknown-type columns was a remark in the CREATE VIEW man page, which
I've updated. I also added a section to typeconv.sgml to document
the behavior.

3. We need to look at whether pg_upgrade is affected. I think we
might be able to let it just ignore the issue for views, since they'd
get properly updated during the dump and reload anyway. But if someone
had an actual table (or matview) with an "unknown" column, that would
be a pg_upgrade hazard, because "unknown" doesn't store like "text".

I tested and found that simple views with unknown columns seem to update
sanely if we just let pg_upgrade dump and restore them. So I suggest we
allow that to happen. There might be cases where dependent views behave
unexpectedly after such a conversion, but I think that would be rare
enough that it's not worth forcing users to fix these views by hand before
updating. However, tables with unknown columns would fail the upgrade
(since we'd reject the CREATE TABLE command) while matviews would be
accepted but then the DDL wouldn't match the physical data storage.
So I added code to pg_upgrade to check for those cases and refuse to
proceed. This is almost a straight copy-and-paste of the existing
pg_upgrade code for checking for "line" columns.

I think this is committable now; the other loose ends can be dealt
with in follow-on patches. Does anyone want to review it?

regards, tom lane

#18

michael@paquier.xyz

over 9 years ago

In reply to: Tom Lane (#17)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

On Mon, Jan 23, 2017 at 7:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

2. I didn't do anything about docs, either, though maybe no change
is needed. One user-visible change from this is that queries should
never return any "unknown"-type columns to clients anymore. But I
think that is not documented now, so maybe there's nothing to change.

The only thing I could find in the SGML docs that directly addresses
unknown-type columns was a remark in the CREATE VIEW man page, which
I've updated. I also added a section to typeconv.sgml to document
the behavior.

This looks good.

3. We need to look at whether pg_upgrade is affected. I think we
might be able to let it just ignore the issue for views, since they'd
get properly updated during the dump and reload anyway. But if someone
had an actual table (or matview) with an "unknown" column, that would
be a pg_upgrade hazard, because "unknown" doesn't store like "text".

I tested and found that simple views with unknown columns seem to update
sanely if we just let pg_upgrade dump and restore them. So I suggest we
allow that to happen. There might be cases where dependent views behave
unexpectedly after such a conversion, but I think that would be rare
enough that it's not worth forcing users to fix these views by hand before
updating. However, tables with unknown columns would fail the upgrade
(since we'd reject the CREATE TABLE command) while matviews would be
accepted but then the DDL wouldn't match the physical data storage.
So I added code to pg_upgrade to check for those cases and refuse to
proceed. This is almost a straight copy-and-paste of the existing
pg_upgrade code for checking for "line" columns.

I think this is committable now; the other loose ends can be dealt
with in follow-on patches. Does anyone want to review it?

I have spent a couple of hours looking at it, and it looks in pretty
good shape. The concept of doing the checks in the parser is far
cleaner than what was proposed upthread to tweak the column lists, and
more generic.

One thing though: even with this patch, it is still possible to define
a domain with unknown as underlying type and have a table grab it:
create domain name as unknown;
create table foo_name (a name);
I think that this case should be restricted as well, and pg_upgrade
had better complain with a lookup at typbasetype in pg_type.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19

ashutosh.bapat@enterprisedb.com

over 9 years ago

In reply to: Tom Lane (#17)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

On Mon, Jan 23, 2017 at 4:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

I spent some time fooling with this today and came up with the attached
patch. I think this is basically the direction we should go in, but
there are various loose ends:

Here's an updated patch that's rebased against today's HEAD and addresses
most of the loose ends:

1. I adjusted a couple of existing regression tests whose results are
affected, but didn't do anything towards adding new tests showing the
desirable results of this change (as per the examples in this thread).

Added some regression test cases. These are mostly designed to prove
that coercion to text occurs where expected, but I did include a couple
of queries that would have failed outright before.

+1. Thanks.

2. I didn't do anything about docs, either, though maybe no change
is needed. One user-visible change from this is that queries should
never return any "unknown"-type columns to clients anymore. But I
think that is not documented now, so maybe there's nothing to change.

The only thing I could find in the SGML docs that directly addresses
unknown-type columns was a remark in the CREATE VIEW man page, which
I've updated. I also added a section to typeconv.sgml to document
the behavior.

3. We need to look at whether pg_upgrade is affected. I think we
might be able to let it just ignore the issue for views, since they'd
get properly updated during the dump and reload anyway. But if someone
had an actual table (or matview) with an "unknown" column, that would
be a pg_upgrade hazard, because "unknown" doesn't store like "text".

I tested and found that simple views with unknown columns seem to update
sanely if we just let pg_upgrade dump and restore them. So I suggest we
allow that to happen. There might be cases where dependent views behave
unexpectedly after such a conversion, but I think that would be rare
enough that it's not worth forcing users to fix these views by hand before
updating. However, tables with unknown columns would fail the upgrade
(since we'd reject the CREATE TABLE command) while matviews would be
accepted but then the DDL wouldn't match the physical data storage.
So I added code to pg_upgrade to check for those cases and refuse to
proceed. This is almost a straight copy-and-paste of the existing
pg_upgrade code for checking for "line" columns.

Following error message might be misleading,

postgres=# create table t1 (a unknown);
ERROR: column "a" has pseudo-type unknown

UNKNOWN is not exactly a pseudo-type.

postgres=# select typname, typtype from pg_type where typname =
'unknown' or typname = 'any';
typname | typtype
---------+---------
unknown | b
any | p
(2 rows)

In your earlier mail, you had raised the point about marking unknown
as a pseudo-type. But till we actually do that, it would be better not
to mention 'unknown' as a pseudo-type.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Michael Paquier (#18)

Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

Michael Paquier <michael.paquier@gmail.com> writes:

One thing though: even with this patch, it is still possible to define
a domain with unknown as underlying type and have a table grab it:
create domain name as unknown;
create table foo_name (a name);
I think that this case should be restricted as well, and pg_upgrade
had better complain with a lookup at typbasetype in pg_type.

Yeah, this is the sort of corner case that I think we ought to plug
by turning "unknown" into a true pseudotype. DefineDomain already
has a defense against that:

regression=# create domain d2 as anyelement;
ERROR: "anyelement" is not a valid base type for a domain
regression=# \errverbose
ERROR: 42804: "anyelement" is not a valid base type for a domain
LOCATION: DefineDomain, typecmds.c:812

Some prodding indicates that the PLs are not uniformly preventing
making functions with unknown input or result type, either.
That's another case that would be less likely to get missed if it
were a true pseudotype.

However, I think that's all material for a separate patch. This
patch is just concerned with what the main parser should do with
"unknown", not with what utility commands should do with it.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Ashutosh Bapat (#19)

#22

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Tom Lane (#21)

#23