Doc Patch: Subquery section to say that subqueries can't modify data
Hi,
The attached documentation patch, doc-subqueries-v1.patch,
applies against head.
I wanted to document that subqueries can't modify data.
This is mentioned in the documentation for SELECT and
implied elsewhere but I was looking for something more
than an 'in-passing' mention.
(I wrote a bad query,
modifying data in a subquery, couldn't recall where
it was documented that you can't do this, and couldn't
find the answer from the TOC or the index. Now that
there's lots of statements with RETURNING clauses
it's natural to want to use them in subqueries.)
There seemed to be no good place to put this in the
tutorial section of the documentation. I wound
up adding a small, 2 paragraph, section describing subqueries to
the chapter on queries. Although the first paragraph
echos what's already documented the utility of
subqueries is such that it's nice to have a
place in the tutorial that serves as a single point of
reference.
The alternative seemed to be to put the 2nd paragraph
in "9.22. Subquery Expressions", and this didn't seem
to fit well.
The last 2 sentences of the first paragraph are
something in the way of helpful hints and may not
be appropriate, or even accurate. I've left them in
for review.
Regards,
Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein
Attachments:
doc-subqueries-v1.patchtext/x-patch; charset=us-ascii; name=doc-subqueries-v1.patchDownload
diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index d7b0d73..acc6686 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -549,7 +549,7 @@ SELECT * FROM my_table AS m WHERE my_table.a > 5; -- wrong
SELECT * FROM people AS mother JOIN people AS child ON mother.id = child.mother_id;
</programlisting>
Additionally, an alias is required if the table reference is a
- subquery (see <xref linkend="queries-subqueries">).
+ subquery (see <xref linkend="queries-subquery-derived-tables">).
</para>
<para>
@@ -590,10 +590,10 @@ SELECT a.* FROM (my_table AS a JOIN your_table AS b ON ...) AS c
</para>
</sect3>
- <sect3 id="queries-subqueries">
- <title>Subqueries</title>
+ <sect3 id="queries-subquery-derived-tables">
+ <title>Subquery Derived Tables</title>
- <indexterm zone="queries-subqueries">
+ <indexterm zone="queries-subquery-derived-tables">
<primary>subquery</primary>
</indexterm>
@@ -1315,6 +1315,46 @@ SELECT DISTINCT ON (<replaceable>expression</replaceable> <optional>, <replaceab
</sect1>
+ <sect1 id="queries-subqueries">
+ <title>Subqueries</title>
+
+ <indexterm zone="queries-subqueries">
+ <primary>subquery</primary>
+ </indexterm>
+
+ <indexterm zone="queries-subqueries">
+ <primary>sub-select</primary>
+ </indexterm>
+
+ <para>
+ Subqueries, also called sub-selects, are queries written within
+ parenthesis in the text of larger queries. The values produced by
+ subqueries may be scalar, used within expressions as described
+ in <xref linkend="sql-syntax-scalar-subqueries">, or tabular. When
+ tabular, subqueries may substitute for tables, as described
+ in <xref linkend="queries-subquery-derived-tables">, generate array
+ content, as described
+ in <xref linkend="sql-syntax-array-constructors">, have their
+ result content tested within expressions, as described
+ in <xref linkend="functions-subquery">, or be used in other
+ contexts. Often either joins or subqueries can be used to produce
+ different query plans yielding identical output. Which technique
+ is appropriate depends upon circumstance but it is worth noting
+ that more work has gone into query planner join optimization.
+ </para>
+
+ <para>
+ Subqueries may not modify database
+ content. <link linkend="queries-with">Common Table
+ Expressions</link> are one way to integrate data returned by data
+ modification statements,
+ i.e. <command>INSERT</command>/<command>UPDATE</command>/<command>DELETE</command>
+ statements with <literal>RETURNING</literal> clauses, into larger
+ queries.
+ </para>
+ </sect1>
+
+
<sect1 id="queries-union">
<title>Combining Queries</title>
Instead of simply expanding the section on sub-queries, which may still be
worthwhile, it seems that we have effectively introduced a new "kind" of
query - namely one that mixes both query DDL and update DDL into a kind of
hybrid query. An entire section describing the means to implement these
queries and the limitations thereof would seem advisable as the current
material is spread throughout the documentation.
Some areas to address would:
Select queries that cause/utilize:
function-based modifications
CTE-based modifications
FDW/dblink-based modifications
I guess the main question is if someone were to put this together would it
likely be included in the "queries" section of the documentation. Also, are
there any other thoughts to add; and is something like this documented in a
ToDo somewhere already?
The proposed patch; while warranting a technical review (namely that the
presence of functions in a sub-select can cause the sub-query to update the
database), seems to add one more place to go find this information without
adding a central index or summary that someone learning the system could
directly comprehend/learn as opposed to it being some allowed/disallowed
side-effect to something else.
David J.
--
View this message in context: http://postgresql.1045698.n5.nabble.com/Doc-Patch-Subquery-section-to-say-that-subqueries-can-t-modify-data-tp5766574p5766580.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Good points.
On 08/06/2013 05:15:28 PM, David Johnston wrote:
Instead of simply expanding the section on sub-queries, which may
still be
worthwhile, it seems that we have effectively introduced a new "kind"
of
query - namely one that mixes both query DDL and update DDL into a
kind of
hybrid query. An entire section describing the means to implement
these
queries and the limitations thereof would seem advisable as the
current
material is spread throughout the documentation.
The proposed patch is an attempt to provide a base upon
which to build such a section.
Some areas to address would:
Select queries that cause/utilize:
function-based modifications
CTE-based modifications
FDW/dblink-based modifications
While it'd be nice to have a full set of examples and an
exhaustive list of what constitutes modification and
what does not it should be enough to state where, in
this sort of hybrid query, modification is allowed
and where not. Of course more detail is needed if
the different kinds of modification above are restricted
in different ways.
I guess the main question is if someone were to put this together
would it
likely be included in the "queries" section of the documentation.
If this section is not to be part of the Query chapter then it
surely also does not belong in the Data Manipulation chapter
(or the Data Definition chapter),
if for no other reason than the information presented in the
Query chapter is necessary to understand the subject.
To me that means it needs it's own chapter, probably immediately
following the Query chapter. I can't think what to call
such a chapter.
The proposed patch; while warranting a technical review (namely that
the
presence of functions in a sub-select can cause the sub-query to
update the
database), seems to add one more place to go find this information
without
adding a central index or summary that someone learning the system
could
directly comprehend/learn as opposed to it being some
allowed/disallowed
side-effect to something else.
I'm less worried about data modifying functions than I am
about the patch's language regards other sorts of modifications.
Although unstated, it should be clear that data modifying
functions that are executed with a SELECT statement do modify data.
Where the patch is lacking is noting that schema alterations and FDW
modifications also have restrictions. I don't feel particularly
qualified regards where either are allowed, or not, although
I could probably get it right after some research.
Regards,
Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 08/06/2013 11:03 PM, Karl O. Pinc wrote:
The attached documentation patch, doc-subqueries-v1.patch,
applies against head.I wanted to document that subqueries can't modify data.
This is mentioned in the documentation for SELECT and
implied elsewhere but I was looking for something more
than an 'in-passing' mention.(I wrote a bad query,
modifying data in a subquery, couldn't recall where
it was documented that you can't do this, and couldn't
find the answer from the TOC or the index. Now that
there's lots of statements with RETURNING clauses
it's natural to want to use them in subqueries.)
Hello, I am (finally) reviewing this patch.
After reading your reasoning, David's rebuttal, and the patch itself;
I'm wondering if this is needed or wanted at all.
Supposing it is wanted, it creates more questions than it answers. The
two biggies are:
* In what other contexts can tabular subqueries be used?
* What are other ways of integrating data returned by data modification
statements?
On a superficial level I find the number of commas a bit clunky, and
"parentheses" is misspelled.
The last 2 sentences of the first paragraph are
something in the way of helpful hints and may not
be appropriate, or even accurate. I've left them in
for review.
I think the last sentence (of the first paragraph) is a bit much, but
the penultimate seems fine.
I'm attaching an updated patch that I think is an improvement but it's
still at a draft level and needs more copyediting. This new patch does
not attempt to answer the two questions above.
--
Vik
Attachments:
doc-subqueries-v2.patchtext/x-patch; name=doc-subqueries-v2.patchDownload
diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index c32c857..b134b66 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -549,7 +549,7 @@ SELECT * FROM my_table AS m WHERE my_table.a > 5; -- wrong
SELECT * FROM people AS mother JOIN people AS child ON mother.id = child.mother_id;
</programlisting>
Additionally, an alias is required if the table reference is a
- subquery (see <xref linkend="queries-subqueries">).
+ subquery (see <xref linkend="queries-subquery-derived-tables">).
</para>
<para>
@@ -590,10 +590,10 @@ SELECT a.* FROM (my_table AS a JOIN your_table AS b ON ...) AS c
</para>
</sect3>
- <sect3 id="queries-subqueries">
- <title>Subqueries</title>
+ <sect3 id="queries-subquery-derived-tables">
+ <title>Subquery Derived Tables</title>
- <indexterm zone="queries-subqueries">
+ <indexterm zone="queries-subquery-derived-tables">
<primary>subquery</primary>
</indexterm>
@@ -1315,6 +1315,44 @@ SELECT DISTINCT ON (<replaceable>expression</replaceable> <optional>, <replaceab
</sect1>
+ <sect1 id="queries-subqueries">
+ <title>Subqueries</title>
+
+ <indexterm zone="queries-subqueries">
+ <primary>subquery</primary>
+ </indexterm>
+
+ <indexterm zone="queries-subqueries">
+ <primary>sub-select</primary>
+ </indexterm>
+
+ <para>
+ Subqueries, also called sub-selects, are queries written within
+ parentheses in the text of larger queries. The values produced by
+ subqueries may be scalar, or tabular. Scalar subqueries are used within expressions as described
+ in <xref linkend="sql-syntax-scalar-subqueries">.
+ Tabular subqueries may substitute for tables as described
+ in <xref linkend="queries-subquery-derived-tables">, generate array
+ content as described
+ in <xref linkend="sql-syntax-array-constructors">, have their
+ result content tested within expressions as described
+ in <xref linkend="functions-subquery">, or be used in other
+ contexts. Often either joins or subqueries can be used to produce
+ different query plans yielding identical output.
+ </para>
+
+ <para>
+ Subqueries may not modify database
+ content. <link linkend="queries-with">Common Table
+ Expressions</link> are one way to integrate data returned by data
+ modification statements,
+ i.e. <command>INSERT</command>/<command>UPDATE</command>/<command>DELETE</command>
+ statements with <literal>RETURNING</literal> clauses, into larger
+ queries.
+ </para>
+ </sect1>
+
+
<sect1 id="queries-union">
<title>Combining Queries</title>