SELECT INTO deprecation
While reading about deprecating and removing various things in other
threads, I was wondering about how deprecated SELECT INTO is. There are
various source code comments about this, but the SELECT INTO reference
page only contains soft language like "recommended". I'm proposing the
attached patch to stick a more explicit deprecation notice right at the top.
I also found some gratuitous uses of SELECT INTO in various tests and
documentation (not ecpg or plpgsql of course). Here is a patch to
adjust those to CREATE TABLE AS.
I don't have a specific plan for removing top-level SELECT INTO
altogether, but there is a nontrivial amount of code for handling it, so
there would be some gain if it could be removed eventually.
Attachments:
0001-Remove-gratuitous-uses-of-deprecated-SELECT-INTO.patchtext/plain; charset=UTF-8; name=0001-Remove-gratuitous-uses-of-deprecated-SELECT-INTO.patch; x-mac-creator=0; x-mac-type=0Download
From a3ae4248340317ac1580c257472e1361841bd4b2 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 2 Dec 2020 12:09:39 +0100
Subject: [PATCH 1/2] Remove gratuitous uses of deprecated SELECT INTO
CREATE TABLE AS has been preferred over SELECT INTO (outside of ecpg
and PL/pgSQL) for a long time. There were still a few uses of SELECT
INTO in tests and documentation, some old, some more recent. This
changes them to CREATE TABLE AS. Some occurrences in the tests remain
where they are specifically testing SELECT INTO parsing or similar.
---
contrib/sepgsql/expected/label.out | 2 +-
contrib/sepgsql/sql/label.sql | 2 +-
doc/src/sgml/hstore.sgml | 2 +-
src/bin/pg_basebackup/t/010_pg_basebackup.pl | 4 ++--
src/bin/pg_checksums/t/002_actions.pl | 2 +-
src/test/regress/expected/create_index.out | 2 +-
src/test/regress/expected/create_misc.out | 2 +-
src/test/regress/expected/random.out | 3 ++-
src/test/regress/expected/select_implicit.out | 6 ++++--
src/test/regress/expected/select_implicit_1.out | 6 ++++--
src/test/regress/expected/select_implicit_2.out | 6 ++++--
src/test/regress/sql/create_index.sql | 2 +-
src/test/regress/sql/create_misc.sql | 2 +-
src/test/regress/sql/random.sql | 3 ++-
src/test/regress/sql/select_implicit.sql | 6 ++++--
15 files changed, 30 insertions(+), 20 deletions(-)
diff --git a/contrib/sepgsql/expected/label.out b/contrib/sepgsql/expected/label.out
index 0300bc6fb4..b1b7db55f6 100644
--- a/contrib/sepgsql/expected/label.out
+++ b/contrib/sepgsql/expected/label.out
@@ -6,7 +6,7 @@
--
CREATE TABLE t1 (a int, b text);
INSERT INTO t1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
-SELECT * INTO t2 FROM t1 WHERE a % 2 = 0;
+CREATE TABLE t2 AS SELECT * FROM t1 WHERE a % 2 = 0;
CREATE FUNCTION f1 () RETURNS text
AS 'SELECT sepgsql_getcon()'
LANGUAGE sql;
diff --git a/contrib/sepgsql/sql/label.sql b/contrib/sepgsql/sql/label.sql
index d19c6edb4c..76e261bee8 100644
--- a/contrib/sepgsql/sql/label.sql
+++ b/contrib/sepgsql/sql/label.sql
@@ -7,7 +7,7 @@
--
CREATE TABLE t1 (a int, b text);
INSERT INTO t1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
-SELECT * INTO t2 FROM t1 WHERE a % 2 = 0;
+CREATE TABLE t2 AS SELECT * FROM t1 WHERE a % 2 = 0;
CREATE FUNCTION f1 () RETURNS text
AS 'SELECT sepgsql_getcon()'
diff --git a/doc/src/sgml/hstore.sgml b/doc/src/sgml/hstore.sgml
index 14a36ade00..25904d9562 100644
--- a/doc/src/sgml/hstore.sgml
+++ b/doc/src/sgml/hstore.sgml
@@ -841,7 +841,7 @@ <title>Statistics</title>
<para>
Using a table:
<programlisting>
-SELECT (each(h)).key, (each(h)).value INTO stat FROM testhstore;
+CREATE TABLE stat AS SELECT (each(h)).key, (each(h)).value FROM testhstore;
</programlisting>
</para>
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index f674a7c94e..9eba7d8d7d 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -502,10 +502,10 @@
# create tables to corrupt and get their relfilenodes
my $file_corrupt1 = $node->safe_psql('postgres',
- q{SELECT a INTO corrupt1 FROM generate_series(1,10000) AS a; ALTER TABLE corrupt1 SET (autovacuum_enabled=false); SELECT pg_relation_filepath('corrupt1')}
+ q{CREATE TABLE corrupt1 AS SELECT a FROM generate_series(1,10000) AS a; ALTER TABLE corrupt1 SET (autovacuum_enabled=false); SELECT pg_relation_filepath('corrupt1')}
);
my $file_corrupt2 = $node->safe_psql('postgres',
- q{SELECT b INTO corrupt2 FROM generate_series(1,2) AS b; ALTER TABLE corrupt2 SET (autovacuum_enabled=false); SELECT pg_relation_filepath('corrupt2')}
+ q{CREATE TABLE corrupt2 AS SELECT b FROM generate_series(1,2) AS b; ALTER TABLE corrupt2 SET (autovacuum_enabled=false); SELECT pg_relation_filepath('corrupt2')}
);
# set page header and block sizes
diff --git a/src/bin/pg_checksums/t/002_actions.pl b/src/bin/pg_checksums/t/002_actions.pl
index 4e4934532a..8a81f36a06 100644
--- a/src/bin/pg_checksums/t/002_actions.pl
+++ b/src/bin/pg_checksums/t/002_actions.pl
@@ -21,7 +21,7 @@ sub check_relation_corruption
$node->safe_psql(
'postgres',
- "SELECT a INTO $table FROM generate_series(1,10000) AS a;
+ "CREATE TABLE $table AS SELECT a FROM generate_series(1,10000) AS a;
ALTER TABLE $table SET (autovacuum_enabled=false);");
$node->safe_psql('postgres',
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 18bb92b810..5fb8c48e95 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -1582,7 +1582,7 @@ DROP TABLE syscol_table;
--
-- Tests for IS NULL/IS NOT NULL with b-tree indexes
--
-SELECT unique1, unique2 INTO onek_with_null FROM onek;
+CREATE TABLE onek_with_null AS SELECT unique1, unique2 FROM onek;
INSERT INTO onek_with_null (unique1,unique2) VALUES (NULL, -1), (NULL, NULL);
CREATE UNIQUE INDEX onek_nulltest ON onek_with_null (unique2,unique1);
SET enable_seqscan = OFF;
diff --git a/src/test/regress/expected/create_misc.out b/src/test/regress/expected/create_misc.out
index cee35ed02f..41bc4d7750 100644
--- a/src/test/regress/expected/create_misc.out
+++ b/src/test/regress/expected/create_misc.out
@@ -5,7 +5,7 @@
-- (any resemblance to real life is purely coincidental)
--
INSERT INTO tenk2 SELECT * FROM tenk1;
-SELECT * INTO TABLE onek2 FROM onek;
+CREATE TABLE onek2 AS SELECT * FROM onek;
INSERT INTO fast_emp4000 SELECT * FROM slow_emp4000;
SELECT *
INTO TABLE Bprime
diff --git a/src/test/regress/expected/random.out b/src/test/regress/expected/random.out
index 302c3d61c7..a919b28d8d 100644
--- a/src/test/regress/expected/random.out
+++ b/src/test/regress/expected/random.out
@@ -23,7 +23,8 @@ INTERSECT
(0 rows)
-- count roughly 1/10 of the tuples
-SELECT count(*) AS random INTO RANDOM_TBL
+CREATE TABLE RANDOM_TBL AS
+ SELECT count(*) AS random
FROM onek WHERE random() < 1.0/10;
-- select again, the count should be different
INSERT INTO RANDOM_TBL (random)
diff --git a/src/test/regress/expected/select_implicit.out b/src/test/regress/expected/select_implicit.out
index 61b485fdaa..27c07de92c 100644
--- a/src/test/regress/expected/select_implicit.out
+++ b/src/test/regress/expected/select_implicit.out
@@ -202,7 +202,8 @@ SELECT count(*) FROM test_missing_target x, test_missing_target y
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(*) INTO TABLE test_missing_target2
+CREATE TABLE test_missing_target2 AS
+SELECT count(*)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b ORDER BY x.b;
@@ -318,7 +319,8 @@ LINE 1: SELECT count(b) FROM test_missing_target x, test_missing_tar...
^
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(x.b) INTO TABLE test_missing_target3
+CREATE TABLE test_missing_target3 AS
+SELECT count(x.b)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b/2 ORDER BY x.b/2;
diff --git a/src/test/regress/expected/select_implicit_1.out b/src/test/regress/expected/select_implicit_1.out
index f277375ebf..d67521e8f8 100644
--- a/src/test/regress/expected/select_implicit_1.out
+++ b/src/test/regress/expected/select_implicit_1.out
@@ -202,7 +202,8 @@ SELECT count(*) FROM test_missing_target x, test_missing_target y
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(*) INTO TABLE test_missing_target2
+CREATE TABLE test_missing_target2 AS
+SELECT count(*)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b ORDER BY x.b;
@@ -318,7 +319,8 @@ LINE 1: SELECT count(b) FROM test_missing_target x, test_missing_tar...
^
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(x.b) INTO TABLE test_missing_target3
+CREATE TABLE test_missing_target3 AS
+SELECT count(x.b)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b/2 ORDER BY x.b/2;
diff --git a/src/test/regress/expected/select_implicit_2.out b/src/test/regress/expected/select_implicit_2.out
index 91c3a24f92..7a353d0862 100644
--- a/src/test/regress/expected/select_implicit_2.out
+++ b/src/test/regress/expected/select_implicit_2.out
@@ -202,7 +202,8 @@ SELECT count(*) FROM test_missing_target x, test_missing_target y
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(*) INTO TABLE test_missing_target2
+CREATE TABLE test_missing_target2 AS
+SELECT count(*)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b ORDER BY x.b;
@@ -318,7 +319,8 @@ LINE 1: SELECT count(b) FROM test_missing_target x, test_missing_tar...
^
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(x.b) INTO TABLE test_missing_target3
+CREATE TABLE test_missing_target3 AS
+SELECT count(x.b)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b/2 ORDER BY x.b/2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 55326eb47b..331670962e 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -609,7 +609,7 @@ CREATE INDEX ON syscol_table (a) WHERE ctid >= '(1000,0)';
-- Tests for IS NULL/IS NOT NULL with b-tree indexes
--
-SELECT unique1, unique2 INTO onek_with_null FROM onek;
+CREATE TABLE onek_with_null AS SELECT unique1, unique2 FROM onek;
INSERT INTO onek_with_null (unique1,unique2) VALUES (NULL, -1), (NULL, NULL);
CREATE UNIQUE INDEX onek_nulltest ON onek_with_null (unique2,unique1);
diff --git a/src/test/regress/sql/create_misc.sql b/src/test/regress/sql/create_misc.sql
index d0b04a821f..c7d0d064c3 100644
--- a/src/test/regress/sql/create_misc.sql
+++ b/src/test/regress/sql/create_misc.sql
@@ -8,7 +8,7 @@
INSERT INTO tenk2 SELECT * FROM tenk1;
-SELECT * INTO TABLE onek2 FROM onek;
+CREATE TABLE onek2 AS SELECT * FROM onek;
INSERT INTO fast_emp4000 SELECT * FROM slow_emp4000;
diff --git a/src/test/regress/sql/random.sql b/src/test/regress/sql/random.sql
index ae6b70a157..8187b2c288 100644
--- a/src/test/regress/sql/random.sql
+++ b/src/test/regress/sql/random.sql
@@ -17,7 +17,8 @@
FROM onek ORDER BY random() LIMIT 1);
-- count roughly 1/10 of the tuples
-SELECT count(*) AS random INTO RANDOM_TBL
+CREATE TABLE RANDOM_TBL AS
+ SELECT count(*) AS random
FROM onek WHERE random() < 1.0/10;
-- select again, the count should be different
diff --git a/src/test/regress/sql/select_implicit.sql b/src/test/regress/sql/select_implicit.sql
index d815504222..de3aef8d81 100644
--- a/src/test/regress/sql/select_implicit.sql
+++ b/src/test/regress/sql/select_implicit.sql
@@ -86,7 +86,8 @@ CREATE TABLE test_missing_target (a int, b int, c char(8), d char);
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(*) INTO TABLE test_missing_target2
+CREATE TABLE test_missing_target2 AS
+SELECT count(*)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b ORDER BY x.b;
@@ -142,7 +143,8 @@ CREATE TABLE test_missing_target (a int, b int, c char(8), d char);
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(x.b) INTO TABLE test_missing_target3
+CREATE TABLE test_missing_target3 AS
+SELECT count(x.b)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b/2 ORDER BY x.b/2;
--
2.29.2
0002-doc-Add-stronger-deprecation-language-to-SELECT-INTO.patchtext/plain; charset=UTF-8; name=0002-doc-Add-stronger-deprecation-language-to-SELECT-INTO.patch; x-mac-creator=0; x-mac-type=0Download
From a7ddf18d2c89005ad6035cf1f40387ca7e949567 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 2 Dec 2020 12:47:03 +0100
Subject: [PATCH 2/2] doc: Add stronger deprecation language to SELECT INTO
reference page
Even though SELECT INTO (not ecpg, not PL/pgSQL) is widely recognized
among insiders to be deprecated, and there are some source code
comments about this, the SELECT INTO reference page only contains
vague language like "recommended". This change puts a deprecation
notice right at the top.
---
doc/src/sgml/ref/select_into.sgml | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/doc/src/sgml/ref/select_into.sgml b/doc/src/sgml/ref/select_into.sgml
index 7b327d9eee..b37ac345b0 100644
--- a/doc/src/sgml/ref/select_into.sgml
+++ b/doc/src/sgml/ref/select_into.sgml
@@ -42,6 +42,12 @@
<refsect1>
<title>Description</title>
+ <para>
+ This command is deprecated. Use <link
+ linkend="sql-createtableas"><command>CREATE TABLE AS</command></link>
+ instead.
+ </para>
+
<para>
<command>SELECT INTO</command> creates a new table and fills it
with data computed by a query. The data is not returned to the
--
2.29.2
On 2 Dec 2020, at 12:54, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
While reading about deprecating and removing various things in other threads, I was wondering about how deprecated SELECT INTO is. There are various source code comments about this, but the SELECT INTO reference page only contains soft language like "recommended". I'm proposing the attached patch to stick a more explicit deprecation notice right at the top.
+ This command is deprecated. Use <link
Should this get similar strong wording to other deprecated things where we add
"..and may/will eventually be removed"?
I also found some gratuitous uses of SELECT INTO in various tests and documentation (not ecpg or plpgsql of course). Here is a patch to adjust those to CREATE TABLE AS.
I didn't scan for others, but the ones included in the 0001 patch all looks
fine and IMO improves readability.
cheers ./daniel
st 2. 12. 2020 v 12:55 odesílatel Peter Eisentraut <
peter.eisentraut@enterprisedb.com> napsal:
While reading about deprecating and removing various things in other
threads, I was wondering about how deprecated SELECT INTO is. There are
various source code comments about this, but the SELECT INTO reference
page only contains soft language like "recommended". I'm proposing the
attached patch to stick a more explicit deprecation notice right at the
top.I also found some gratuitous uses of SELECT INTO in various tests and
documentation (not ecpg or plpgsql of course). Here is a patch to
adjust those to CREATE TABLE AS.I don't have a specific plan for removing top-level SELECT INTO
altogether, but there is a nontrivial amount of code for handling it, so
there would be some gain if it could be removed eventually.
+1
Pavel
Greetings,
* Peter Eisentraut (peter.eisentraut@enterprisedb.com) wrote:
While reading about deprecating and removing various things in other
threads, I was wondering about how deprecated SELECT INTO is. There are
various source code comments about this, but the SELECT INTO reference page
only contains soft language like "recommended". I'm proposing the attached
patch to stick a more explicit deprecation notice right at the top.
I don't see much value in this. Users already have 5 years to adapt
their code to new major versions of PG and that strikes me as plenty
enough time and is why we support multiple major versions of PG for so
long. Users who keep pace and update for each major version aren't
likely to have issue making this change since they're already used to
regularly updating their code for new major versions, while others are
going to complain no matter when we remove it and will ignore any
deprecation notices we put out there, so there isn't much point in them.
I also found some gratuitous uses of SELECT INTO in various tests and
documentation (not ecpg or plpgsql of course). Here is a patch to adjust
those to CREATE TABLE AS.
If we aren't actually removing SELECT INTO then I don't know that it
makes sense to just stop testing it.
I don't have a specific plan for removing top-level SELECT INTO altogether,
but there is a nontrivial amount of code for handling it, so there would be
some gain if it could be removed eventually.
We should either remove it, or remove the comments that it's deprecated,
not try to make it more deprecated or try to somehow increase the
recommendation to not use it.
I'd recommend we remove it and make note that it's been removed in v14
in the back branches at the same time, which will give those who
actually pay attention and care that much more time before v14 comes out
to update their code (not that I'm all that worried, as they'll also see
it in the beta release notes too).
Thanks,
Stephen
On Wed, Dec 02, 2020 at 12:58:36PM -0500, Stephen Frost wrote:
Greetings,
* Peter Eisentraut (peter.eisentraut@enterprisedb.com) wrote:
While reading about deprecating and removing various things in
other threads, I was wondering about how deprecated SELECT INTO
is. There are various source code comments about this, but the
SELECT INTO reference page only contains soft language like
"recommended". I'm proposing the attached patch to stick a more
explicit deprecation notice right at the top.I don't see much value in this. Users already have 5 years to adapt
their code to new major versions of PG and that strikes me as plenty
enough time and is why we support multiple major versions of PG for
so long. Users who keep pace and update for each major version
aren't likely to have issue making this change since they're already
used to regularly updating their code for new major versions, while
others are going to complain no matter when we remove it and will
ignore any deprecation notices we put out there, so there isn't much
point in them.
+1 for removing it entirely and including this prominently in the
release notes.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Stephen Frost <sfrost@snowman.net> writes:
* Peter Eisentraut (peter.eisentraut@enterprisedb.com) wrote:
While reading about deprecating and removing various things in other
threads, I was wondering about how deprecated SELECT INTO is. There are
various source code comments about this, but the SELECT INTO reference page
only contains soft language like "recommended". I'm proposing the attached
patch to stick a more explicit deprecation notice right at the top.
I don't see much value in this.
Yeah, if we want to kill it let's just do so. The negative language in
the reference page has been there since (at least) 7.1, so people can
hardly say they didn't see it coming.
regards, tom lane
On Wed, Dec 02, 2020 at 03:35:46PM -0500, Tom Lane wrote:
Yeah, if we want to kill it let's just do so. The negative language in
the reference page has been there since (at least) 7.1, so people can
hardly say they didn't see it coming.
+1. I got to wonder about the impact when migrating applications
though. SELECT INTO has a different meaning in Oracle, but SQL server
creates a new table like Postgres.
--
Michael
Stephen Frost schrieb am 02.12.2020 um 18:58:
We should either remove it, or remove the comments that it's deprecated,
not try to make it more deprecated or try to somehow increase the
recommendation to not use it.
(I am writing from a "user only" perspective, not a developer)
I don't see any warning about the syntax being "deprecated" in the current manual.
There is only a note that says that CTAS is "recommended" instead of SELET INTO,
but for me that's something entirely different than "deprecating" it.
I personally have nothing against removing it, but I still see it used
a lot in questions on various online forums, and I would think that
a lot of people would be very unpleasantly surprised if a feature
gets removed without any warning (the current "recommendation" does not
constitute a deprecation or even removal warning for most people I guess)
I would vote for a clear deprecation message as suggested by Peter, but I would
add "and will be removed in a future version" to it.
Not sure if maybe even back-patching that warning would make sense as well, so
that also users of older versions get to see that warning.
Then target 15 or 16 as the release for removal, but not 14
Thomas
On 2020-12-02 18:58, Stephen Frost wrote:
I also found some gratuitous uses of SELECT INTO in various tests and
documentation (not ecpg or plpgsql of course). Here is a patch to adjust
those to CREATE TABLE AS.If we aren't actually removing SELECT INTO then I don't know that it
makes sense to just stop testing it.
The point here was, there is still code that actually tests SELECT INTO
specifically. But unrelated test code that just wants to set up a quick
table with some rows in it ought to use the preferred syntax for doing so.
On 2020-12-03 00:54, Michael Paquier wrote:
I got to wonder about the impact when migrating applications
though. SELECT INTO has a different meaning in Oracle, but SQL server
creates a new table like Postgres.
Interesting. This appears to be the case. SQL Server uses SELECT INTO
to create a table, and does not appear to have CREATE TABLE AS.
So maybe we should keep it, but adjust the documentation to point out
this use case.
[some snarky comment about AWS Babelfish here ... ;-) ]
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
Interesting. This appears to be the case. SQL Server uses SELECT INTO
to create a table, and does not appear to have CREATE TABLE AS.
So maybe we should keep it, but adjust the documentation to point out
this use case.
That argument makes sense, but only if our version is a drop-in
replacement for SQL Server's version: if people have to adjust their
commands anyway in corner cases, we're not doing them any big favor.
So: are the syntax and semantics really a match? Do we have feature
parity?
As I recall, a whole lot of the pain we have with INTO has to do
with the semantics we've chosen for INTO in a set-operation nest.
We think you can write something like
SELECT ... INTO foo FROM ... UNION SELECT ... FROM ...
but we insist on the INTO being in the first component SELECT.
I'd like to know exactly how much of that messiness is shared
by SQL Server.
(FWIW, I think the fact that SELECT INTO means something entirely
different in plpgsql is a good reason for killing off one version
or the other. As things stand, it's mighty confusing.)
regards, tom lane
On 2020-12-03 16:34, Tom Lane wrote:
As I recall, a whole lot of the pain we have with INTO has to do
with the semantics we've chosen for INTO in a set-operation nest.
We think you can write something likeSELECT ... INTO foo FROM ... UNION SELECT ... FROM ...
but we insist on the INTO being in the first component SELECT.
I'd like to know exactly how much of that messiness is shared
by SQL Server.
On sqlfiddle.com, this works:
select a into t3 from t1 union select a from t2;
but this gets an error:
select a from t1 union select a into t4 from t2;
SELECT INTO must be the first query in a statement containing a UNION,
INTERSECT or EXCEPT operator.
On 2020-12-03 20:26, Peter Eisentraut wrote:
On 2020-12-03 16:34, Tom Lane wrote:
As I recall, a whole lot of the pain we have with INTO has to do
with the semantics we've chosen for INTO in a set-operation nest.
We think you can write something likeSELECT ... INTO foo FROM ... UNION SELECT ... FROM ...
but we insist on the INTO being in the first component SELECT.
I'd like to know exactly how much of that messiness is shared
by SQL Server.On sqlfiddle.com, this works:
select a into t3 from t1 union select a from t2;
but this gets an error:
select a from t1 union select a into t4 from t2;
SELECT INTO must be the first query in a statement containing a UNION,
INTERSECT or EXCEPT operator.
So, with that in mind, here is an alternative proposal that points out
that SELECT INTO has some use for compatibility.
Attachments:
v2-0001-Remove-gratuitous-uses-of-deprecated-SELECT-INTO.patchtext/plain; charset=UTF-8; name=v2-0001-Remove-gratuitous-uses-of-deprecated-SELECT-INTO.patch; x-mac-creator=0; x-mac-type=0Download
From ab0dc84e0519d618a316456598fc277a2f1c9f6f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 2 Dec 2020 12:09:39 +0100
Subject: [PATCH v2 1/2] Remove gratuitous uses of deprecated SELECT INTO
CREATE TABLE AS has been preferred over SELECT INTO (outside of ecpg
and PL/pgSQL) for a long time. There were still a few uses of SELECT
INTO in tests and documentation, some old, some more recent. This
changes them to CREATE TABLE AS. Some occurrences in the tests remain
where they are specifically testing SELECT INTO parsing or similar.
---
contrib/sepgsql/expected/label.out | 2 +-
contrib/sepgsql/sql/label.sql | 2 +-
doc/src/sgml/hstore.sgml | 2 +-
src/bin/pg_basebackup/t/010_pg_basebackup.pl | 4 ++--
src/bin/pg_checksums/t/002_actions.pl | 2 +-
src/test/regress/expected/create_index.out | 2 +-
src/test/regress/expected/create_misc.out | 2 +-
src/test/regress/expected/random.out | 3 ++-
src/test/regress/expected/select_implicit.out | 6 ++++--
src/test/regress/expected/select_implicit_1.out | 6 ++++--
src/test/regress/expected/select_implicit_2.out | 6 ++++--
src/test/regress/sql/create_index.sql | 2 +-
src/test/regress/sql/create_misc.sql | 2 +-
src/test/regress/sql/random.sql | 3 ++-
src/test/regress/sql/select_implicit.sql | 6 ++++--
15 files changed, 30 insertions(+), 20 deletions(-)
diff --git a/contrib/sepgsql/expected/label.out b/contrib/sepgsql/expected/label.out
index 0300bc6fb4..b1b7db55f6 100644
--- a/contrib/sepgsql/expected/label.out
+++ b/contrib/sepgsql/expected/label.out
@@ -6,7 +6,7 @@
--
CREATE TABLE t1 (a int, b text);
INSERT INTO t1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
-SELECT * INTO t2 FROM t1 WHERE a % 2 = 0;
+CREATE TABLE t2 AS SELECT * FROM t1 WHERE a % 2 = 0;
CREATE FUNCTION f1 () RETURNS text
AS 'SELECT sepgsql_getcon()'
LANGUAGE sql;
diff --git a/contrib/sepgsql/sql/label.sql b/contrib/sepgsql/sql/label.sql
index d19c6edb4c..76e261bee8 100644
--- a/contrib/sepgsql/sql/label.sql
+++ b/contrib/sepgsql/sql/label.sql
@@ -7,7 +7,7 @@
--
CREATE TABLE t1 (a int, b text);
INSERT INTO t1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
-SELECT * INTO t2 FROM t1 WHERE a % 2 = 0;
+CREATE TABLE t2 AS SELECT * FROM t1 WHERE a % 2 = 0;
CREATE FUNCTION f1 () RETURNS text
AS 'SELECT sepgsql_getcon()'
diff --git a/doc/src/sgml/hstore.sgml b/doc/src/sgml/hstore.sgml
index 14a36ade00..25904d9562 100644
--- a/doc/src/sgml/hstore.sgml
+++ b/doc/src/sgml/hstore.sgml
@@ -841,7 +841,7 @@ <title>Statistics</title>
<para>
Using a table:
<programlisting>
-SELECT (each(h)).key, (each(h)).value INTO stat FROM testhstore;
+CREATE TABLE stat AS SELECT (each(h)).key, (each(h)).value FROM testhstore;
</programlisting>
</para>
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index f674a7c94e..9eba7d8d7d 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -502,10 +502,10 @@
# create tables to corrupt and get their relfilenodes
my $file_corrupt1 = $node->safe_psql('postgres',
- q{SELECT a INTO corrupt1 FROM generate_series(1,10000) AS a; ALTER TABLE corrupt1 SET (autovacuum_enabled=false); SELECT pg_relation_filepath('corrupt1')}
+ q{CREATE TABLE corrupt1 AS SELECT a FROM generate_series(1,10000) AS a; ALTER TABLE corrupt1 SET (autovacuum_enabled=false); SELECT pg_relation_filepath('corrupt1')}
);
my $file_corrupt2 = $node->safe_psql('postgres',
- q{SELECT b INTO corrupt2 FROM generate_series(1,2) AS b; ALTER TABLE corrupt2 SET (autovacuum_enabled=false); SELECT pg_relation_filepath('corrupt2')}
+ q{CREATE TABLE corrupt2 AS SELECT b FROM generate_series(1,2) AS b; ALTER TABLE corrupt2 SET (autovacuum_enabled=false); SELECT pg_relation_filepath('corrupt2')}
);
# set page header and block sizes
diff --git a/src/bin/pg_checksums/t/002_actions.pl b/src/bin/pg_checksums/t/002_actions.pl
index 4e4934532a..8a81f36a06 100644
--- a/src/bin/pg_checksums/t/002_actions.pl
+++ b/src/bin/pg_checksums/t/002_actions.pl
@@ -21,7 +21,7 @@ sub check_relation_corruption
$node->safe_psql(
'postgres',
- "SELECT a INTO $table FROM generate_series(1,10000) AS a;
+ "CREATE TABLE $table AS SELECT a FROM generate_series(1,10000) AS a;
ALTER TABLE $table SET (autovacuum_enabled=false);");
$node->safe_psql('postgres',
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index fc6afab58a..ce734f7ef3 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -1582,7 +1582,7 @@ DROP TABLE syscol_table;
--
-- Tests for IS NULL/IS NOT NULL with b-tree indexes
--
-SELECT unique1, unique2 INTO onek_with_null FROM onek;
+CREATE TABLE onek_with_null AS SELECT unique1, unique2 FROM onek;
INSERT INTO onek_with_null (unique1,unique2) VALUES (NULL, -1), (NULL, NULL);
CREATE UNIQUE INDEX onek_nulltest ON onek_with_null (unique2,unique1);
SET enable_seqscan = OFF;
diff --git a/src/test/regress/expected/create_misc.out b/src/test/regress/expected/create_misc.out
index cee35ed02f..41bc4d7750 100644
--- a/src/test/regress/expected/create_misc.out
+++ b/src/test/regress/expected/create_misc.out
@@ -5,7 +5,7 @@
-- (any resemblance to real life is purely coincidental)
--
INSERT INTO tenk2 SELECT * FROM tenk1;
-SELECT * INTO TABLE onek2 FROM onek;
+CREATE TABLE onek2 AS SELECT * FROM onek;
INSERT INTO fast_emp4000 SELECT * FROM slow_emp4000;
SELECT *
INTO TABLE Bprime
diff --git a/src/test/regress/expected/random.out b/src/test/regress/expected/random.out
index 302c3d61c7..a919b28d8d 100644
--- a/src/test/regress/expected/random.out
+++ b/src/test/regress/expected/random.out
@@ -23,7 +23,8 @@ INTERSECT
(0 rows)
-- count roughly 1/10 of the tuples
-SELECT count(*) AS random INTO RANDOM_TBL
+CREATE TABLE RANDOM_TBL AS
+ SELECT count(*) AS random
FROM onek WHERE random() < 1.0/10;
-- select again, the count should be different
INSERT INTO RANDOM_TBL (random)
diff --git a/src/test/regress/expected/select_implicit.out b/src/test/regress/expected/select_implicit.out
index 61b485fdaa..27c07de92c 100644
--- a/src/test/regress/expected/select_implicit.out
+++ b/src/test/regress/expected/select_implicit.out
@@ -202,7 +202,8 @@ SELECT count(*) FROM test_missing_target x, test_missing_target y
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(*) INTO TABLE test_missing_target2
+CREATE TABLE test_missing_target2 AS
+SELECT count(*)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b ORDER BY x.b;
@@ -318,7 +319,8 @@ LINE 1: SELECT count(b) FROM test_missing_target x, test_missing_tar...
^
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(x.b) INTO TABLE test_missing_target3
+CREATE TABLE test_missing_target3 AS
+SELECT count(x.b)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b/2 ORDER BY x.b/2;
diff --git a/src/test/regress/expected/select_implicit_1.out b/src/test/regress/expected/select_implicit_1.out
index f277375ebf..d67521e8f8 100644
--- a/src/test/regress/expected/select_implicit_1.out
+++ b/src/test/regress/expected/select_implicit_1.out
@@ -202,7 +202,8 @@ SELECT count(*) FROM test_missing_target x, test_missing_target y
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(*) INTO TABLE test_missing_target2
+CREATE TABLE test_missing_target2 AS
+SELECT count(*)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b ORDER BY x.b;
@@ -318,7 +319,8 @@ LINE 1: SELECT count(b) FROM test_missing_target x, test_missing_tar...
^
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(x.b) INTO TABLE test_missing_target3
+CREATE TABLE test_missing_target3 AS
+SELECT count(x.b)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b/2 ORDER BY x.b/2;
diff --git a/src/test/regress/expected/select_implicit_2.out b/src/test/regress/expected/select_implicit_2.out
index 91c3a24f92..7a353d0862 100644
--- a/src/test/regress/expected/select_implicit_2.out
+++ b/src/test/regress/expected/select_implicit_2.out
@@ -202,7 +202,8 @@ SELECT count(*) FROM test_missing_target x, test_missing_target y
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(*) INTO TABLE test_missing_target2
+CREATE TABLE test_missing_target2 AS
+SELECT count(*)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b ORDER BY x.b;
@@ -318,7 +319,8 @@ LINE 1: SELECT count(b) FROM test_missing_target x, test_missing_tar...
^
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(x.b) INTO TABLE test_missing_target3
+CREATE TABLE test_missing_target3 AS
+SELECT count(x.b)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b/2 ORDER BY x.b/2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 824cb9f9e8..fd4f30876e 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -609,7 +609,7 @@ CREATE INDEX ON syscol_table (a) WHERE ctid >= '(1000,0)';
-- Tests for IS NULL/IS NOT NULL with b-tree indexes
--
-SELECT unique1, unique2 INTO onek_with_null FROM onek;
+CREATE TABLE onek_with_null AS SELECT unique1, unique2 FROM onek;
INSERT INTO onek_with_null (unique1,unique2) VALUES (NULL, -1), (NULL, NULL);
CREATE UNIQUE INDEX onek_nulltest ON onek_with_null (unique2,unique1);
diff --git a/src/test/regress/sql/create_misc.sql b/src/test/regress/sql/create_misc.sql
index d0b04a821f..c7d0d064c3 100644
--- a/src/test/regress/sql/create_misc.sql
+++ b/src/test/regress/sql/create_misc.sql
@@ -8,7 +8,7 @@
INSERT INTO tenk2 SELECT * FROM tenk1;
-SELECT * INTO TABLE onek2 FROM onek;
+CREATE TABLE onek2 AS SELECT * FROM onek;
INSERT INTO fast_emp4000 SELECT * FROM slow_emp4000;
diff --git a/src/test/regress/sql/random.sql b/src/test/regress/sql/random.sql
index ae6b70a157..8187b2c288 100644
--- a/src/test/regress/sql/random.sql
+++ b/src/test/regress/sql/random.sql
@@ -17,7 +17,8 @@
FROM onek ORDER BY random() LIMIT 1);
-- count roughly 1/10 of the tuples
-SELECT count(*) AS random INTO RANDOM_TBL
+CREATE TABLE RANDOM_TBL AS
+ SELECT count(*) AS random
FROM onek WHERE random() < 1.0/10;
-- select again, the count should be different
diff --git a/src/test/regress/sql/select_implicit.sql b/src/test/regress/sql/select_implicit.sql
index d815504222..de3aef8d81 100644
--- a/src/test/regress/sql/select_implicit.sql
+++ b/src/test/regress/sql/select_implicit.sql
@@ -86,7 +86,8 @@ CREATE TABLE test_missing_target (a int, b int, c char(8), d char);
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(*) INTO TABLE test_missing_target2
+CREATE TABLE test_missing_target2 AS
+SELECT count(*)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b ORDER BY x.b;
@@ -142,7 +143,8 @@ CREATE TABLE test_missing_target (a int, b int, c char(8), d char);
-- group w/o existing GROUP BY target under ambiguous condition
-- into a table
-SELECT count(x.b) INTO TABLE test_missing_target3
+CREATE TABLE test_missing_target3 AS
+SELECT count(x.b)
FROM test_missing_target x, test_missing_target y
WHERE x.a = y.a
GROUP BY x.b/2 ORDER BY x.b/2;
--
2.29.2
v2-0002-doc-Clarify-status-of-SELECT-INTO-on-reference-pa.patchtext/plain; charset=UTF-8; name=v2-0002-doc-Clarify-status-of-SELECT-INTO-on-reference-pa.patch; x-mac-creator=0; x-mac-type=0Download
From 82f9247779658915fe0e57a8797622483de7e4c3 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 9 Dec 2020 21:35:14 +0100
Subject: [PATCH v2 2/2] doc: Clarify status of SELECT INTO on reference page
The documentation as well as source code comments weren't entirely
clear whether SELECT INTO was truly deprecated (thus in theory
destined to be removed eventually), or just a less recommended
variant. After discussion, it appears that other implementations also
use SELECT INTO in direct SQL in a way similar to PostgreSQL, so it
seems worth keeping it for compatibility. Update the language in the
documentation to that effect.
Discussion: https://www.postgresql.org/message-id/flat/96dc0df3-e13a-a85d-d045-d6e2c85218da%40enterprisedb.com
---
doc/src/sgml/ref/select_into.sgml | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/ref/select_into.sgml b/doc/src/sgml/ref/select_into.sgml
index 7b327d9eee..acc6401485 100644
--- a/doc/src/sgml/ref/select_into.sgml
+++ b/doc/src/sgml/ref/select_into.sgml
@@ -138,9 +138,11 @@ <title>Compatibility</title>
in <application>ECPG</application> (see <xref linkend="ecpg"/>) and
<application>PL/pgSQL</application> (see <xref linkend="plpgsql"/>).
The <productname>PostgreSQL</productname> usage of <command>SELECT
- INTO</command> to represent table creation is historical. It is
- best to use <command>CREATE TABLE AS</command> for this purpose in
- new code.
+ INTO</command> to represent table creation is historical. Some other SQL
+ implementations also use <command>SELECT INTO</command> in this way (but
+ most SQL implementations support <command>CREATE TABLE AS</command>
+ instead). Apart from such compatibility considerations, it is best to use
+ <command>CREATE TABLE AS</command> for this purpose in new code.
</para>
</refsect1>
--
2.29.2
On Wed, Dec 9, 2020 at 09:48:54PM +0100, Peter Eisentraut wrote:
On 2020-12-03 20:26, Peter Eisentraut wrote:
On 2020-12-03 16:34, Tom Lane wrote:
As I recall, a whole lot of the pain we have with INTO has to do
with the semantics we've chosen for INTO in a set-operation nest.
We think you can write something likeSELECT ... INTO foo FROM ... UNION SELECT ... FROM ...
but we insist on the INTO being in the first component SELECT.
I'd like to know exactly how much of that messiness is shared
by SQL Server.On sqlfiddle.com, this works:
select a into t3 from t1 union select a from t2;
but this gets an error:
select a from t1 union select a into t4 from t2;
SELECT INTO must be the first query in a statement containing a UNION,
INTERSECT or EXCEPT operator.So, with that in mind, here is an alternative proposal that points out that
SELECT INTO has some use for compatibility.
Do we really want to carry around confusing syntax for compatibility? I
doubt we would ever add INTO now, even for compatibility.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On 2020-12-15 23:13, Bruce Momjian wrote:
Do we really want to carry around confusing syntax for compatibility? I
doubt we would ever add INTO now, even for compatibility.
Right, we would very likely not add it now. But it doesn't seem to
cause a lot of ongoing maintenance burden, so if there is a use case,
it's not unreasonable to keep it around. I have no other motive here, I
was just curious.
On Wed, Dec 16, 2020 at 06:07:08PM +0100, Peter Eisentraut wrote:
Right, we would very likely not add it now. But it doesn't seem to cause a
lot of ongoing maintenance burden, so if there is a use case, it's not
unreasonable to keep it around. I have no other motive here, I was just
curious.
From the point of view of the code, this would simplify the grammar
rules of SELECT, which is a good thing. And this would also simplify
some code in the rewriter where we create some CreateTableAsStmt
on-the-fly where an IntoClause is specified, but my take is that this
is not really a burden when it comes to long-term maintenance. And
the git history tells, at least it seems to me so, that this is not
manipulated much.
--
Michael
On 2020-12-17 02:30, Michael Paquier wrote:
On Wed, Dec 16, 2020 at 06:07:08PM +0100, Peter Eisentraut wrote:
Right, we would very likely not add it now. But it doesn't seem to cause a
lot of ongoing maintenance burden, so if there is a use case, it's not
unreasonable to keep it around. I have no other motive here, I was just
curious.From the point of view of the code, this would simplify the grammar
rules of SELECT, which is a good thing. And this would also simplify
some code in the rewriter where we create some CreateTableAsStmt
on-the-fly where an IntoClause is specified, but my take is that this
is not really a burden when it comes to long-term maintenance. And
the git history tells, at least it seems to me so, that this is not
manipulated much.
I have committed my small documentation patch that points out
compatibility with other implementations.
On 12/15/20 5:13 PM, Bruce Momjian wrote:
On Wed, Dec 9, 2020 at 09:48:54PM +0100, Peter Eisentraut wrote:
On 2020-12-03 20:26, Peter Eisentraut wrote:
On 2020-12-03 16:34, Tom Lane wrote:
As I recall, a whole lot of the pain we have with INTO has to do
with the semantics we've chosen for INTO in a set-operation nest.
We think you can write something likeSELECT ... INTO foo FROM ... UNION SELECT ... FROM ...
but we insist on the INTO being in the first component SELECT.
I'd like to know exactly how much of that messiness is shared
by SQL Server.On sqlfiddle.com, this works:
select a into t3 from t1 union select a from t2;
but this gets an error:
select a from t1 union select a into t4 from t2;
SELECT INTO must be the first query in a statement containing a UNION,
INTERSECT or EXCEPT operator.So, with that in mind, here is an alternative proposal that points out that
SELECT INTO has some use for compatibility.Do we really want to carry around confusing syntax for compatibility? I
doubt we would ever add INTO now, even for compatibility.
If memory serves the INTO syntax is a result from the first incarnation
of PL/pgSQL being based on the Oracle PL/SQL syntax. I think it has been
there from the very beginning, which makes it likely that by now a lot
of migrants are using it in rather old code.
I don't think it should be our business to throw wrenches into their
existing applications.
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services