[patch] [doc] Minor variable related cleanup and rewording of plpgsql docs
Hackers,
Bug # 16519 [1]/messages/by-id/16519-9ef04828d058a319@postgresql.org is another report of confusion regarding trying to use
parameters in improper locations - specifically the SET ROLE command within
pl/pgsql. I'm re-attaching the doc patch and am adding it to the
commitfest.
David J.
[1]: /messages/by-id/16519-9ef04828d058a319@postgresql.org
/messages/by-id/16519-9ef04828d058a319@postgresql.org
Attachments:
v1-doc-plpgsql-variable-usage-cleanup.patchapplication/octet-stream; name=v1-doc-plpgsql-variable-usage-cleanup.patchDownload+16-20
čt 26. 11. 2020 v 6:41 odesílatel David G. Johnston <
david.g.johnston@gmail.com> napsal:
Hackers,
Bug # 16519 [1] is another report of confusion regarding trying to use
parameters in improper locations - specifically the SET ROLE command within
pl/pgsql. I'm re-attaching the doc patch and am adding it to the
commitfest.
I checked this patch, and I think so it is correct - my comments are just
about enhancing by some examples
Maybe for following sentence the some examples can be practical
+ If the SQL command being executed is incapable of returning a result
+ it does not accept query parameters.
</para>
+ it does not accept query parameters (usually DDL commands like CREATE
TABLE, DROP TABLE, ALTER ... ).
+ Query parameters will only be substituted in places where
syntactically allowed
+ (in particular, identifiers can never be replaced with a query
parameter.)
+ As an extreme case, consider this example of poor programming style:
In this case, I miss the more precious specification of identifiers
+ (in particular, SQL identifiers (like schema, table, column names) can
never be replaced with a query parameter.)
Regards
Pavel
Show quoted text
David J.
On Thu, Nov 26, 2020 at 12:49 AM Pavel Stehule <pavel.stehule@gmail.com>
wrote:
čt 26. 11. 2020 v 6:41 odesílatel David G. Johnston <
david.g.johnston@gmail.com> napsal:Hackers,
Bug # 16519 [1] is another report of confusion regarding trying to use
parameters in improper locations - specifically the SET ROLE command within
pl/pgsql. I'm re-attaching the doc patch and am adding it to the
commitfest.I checked this patch, and I think so it is correct - my comments are just
about enhancing by some examples
Thank you for the review.
v2 attached.
I added examples in the two places you noted.
Upon re-reading, I decided that opening up the section by including
everything then fitting in parameters with an exception for utility
commands (without previously/otherwise identifying them) forced some
undesirable verbosity. Instead, I opened up with the utility commands as
the main body of non-result returning commands and then moved onto
delete/insert/update non-returning cases when the subsequent paragraph
regarding parameters can then refer to the second class (by way of
excluding the first class). This seems to flow better, IMO.
David J.
Attachments:
v2-doc-plpgsql-variable-usage-cleanup.patchapplication/octet-stream; name=v2-doc-plpgsql-variable-usage-cleanup.patchDownload+27-24
po 30. 11. 2020 v 4:24 odesílatel David G. Johnston <
david.g.johnston@gmail.com> napsal:
On Thu, Nov 26, 2020 at 12:49 AM Pavel Stehule <pavel.stehule@gmail.com>
wrote:čt 26. 11. 2020 v 6:41 odesílatel David G. Johnston <
david.g.johnston@gmail.com> napsal:Hackers,
Bug # 16519 [1] is another report of confusion regarding trying to use
parameters in improper locations - specifically the SET ROLE command within
pl/pgsql. I'm re-attaching the doc patch and am adding it to the
commitfest.I checked this patch, and I think so it is correct - my comments are just
about enhancing by some examplesThank you for the review.
v2 attached.
I added examples in the two places you noted.
Upon re-reading, I decided that opening up the section by including
everything then fitting in parameters with an exception for utility
commands (without previously/otherwise identifying them) forced some
undesirable verbosity. Instead, I opened up with the utility commands as
the main body of non-result returning commands and then moved onto
delete/insert/update non-returning cases when the subsequent paragraph
regarding parameters can then refer to the second class (by way of
excluding the first class). This seems to flow better, IMO.
I have no objections, but maybe these pages are a little bit unclear
generally, because the core of the problem is not described.
Personally I miss a description of the reason why variables cannot be used
- the description "variables cannot be used in statements without result"
is true, but it is not core information.
The important fact is usage of an execution plan or not. The statements
with an execution plan can be parametrized (DML - INSERT, UPDATE, DELETE),
and SELECT. The statements without execution plans should not be
parametrized. The only execution via execution plan executor allows
parametrization. DDL statements are executed via utility execution, and
their parameterization is not supported.
Regards
Pavel
Show quoted text
David J.
On Mon, Nov 30, 2020 at 12:51 AM Pavel Stehule <pavel.stehule@gmail.com>
wrote:
po 30. 11. 2020 v 4:24 odesílatel David G. Johnston <
david.g.johnston@gmail.com> napsal:On Thu, Nov 26, 2020 at 12:49 AM Pavel Stehule <pavel.stehule@gmail.com>
wrote:čt 26. 11. 2020 v 6:41 odesílatel David G. Johnston <
david.g.johnston@gmail.com> napsal:Hackers,
Bug # 16519 [1] is another report of confusion regarding trying to use
parameters in improper locations - specifically the SET ROLE command within
pl/pgsql. I'm re-attaching the doc patch and am adding it to the
commitfest.I checked this patch, and I think so it is correct - my comments are
just about enhancing by some examplesThank you for the review.
v2 attached.
I added examples in the two places you noted.
Upon re-reading, I decided that opening up the section by including
everything then fitting in parameters with an exception for utility
commands (without previously/otherwise identifying them) forced some
undesirable verbosity. Instead, I opened up with the utility commands as
the main body of non-result returning commands and then moved onto
delete/insert/update non-returning cases when the subsequent paragraph
regarding parameters can then refer to the second class (by way of
excluding the first class). This seems to flow better, IMO.I have no objections, but maybe these pages are a little bit unclear
generally, because the core of the problem is not described.
Personally I miss a description of the reason why variables cannot be used
- the description "variables cannot be used in statements without result"
is true, but it is not core information.
In the section "executing commands that don't return results" it does seem
like core information...but I get your point.
The important fact is usage of an execution plan or not.
This is already mentioned in the linked-to section:
"Variable substitution currently works only in SELECT, INSERT, UPDATE, and
DELETE commands, because the main SQL engine allows query parameters only
in these commands. To use a non-constant name or value in other statement
types (generically called utility statements), you must construct the
utility statement as a string and EXECUTE it."
I didn't feel the need to repeat that material in full in the "no results"
section. I left that pointing out the "results" dynamic there would be
useful since the original wording seemed to forget about the presence of
utility commands altogether which was confusing for that section.
David J.
po 30. 11. 2020 v 16:06 odesílatel David G. Johnston <
david.g.johnston@gmail.com> napsal:
On Mon, Nov 30, 2020 at 12:51 AM Pavel Stehule <pavel.stehule@gmail.com>
wrote:po 30. 11. 2020 v 4:24 odesílatel David G. Johnston <
david.g.johnston@gmail.com> napsal:On Thu, Nov 26, 2020 at 12:49 AM Pavel Stehule <pavel.stehule@gmail.com>
wrote:čt 26. 11. 2020 v 6:41 odesílatel David G. Johnston <
david.g.johnston@gmail.com> napsal:Hackers,
Bug # 16519 [1] is another report of confusion regarding trying to use
parameters in improper locations - specifically the SET ROLE command within
pl/pgsql. I'm re-attaching the doc patch and am adding it to the
commitfest.I checked this patch, and I think so it is correct - my comments are
just about enhancing by some examplesThank you for the review.
v2 attached.
I added examples in the two places you noted.
Upon re-reading, I decided that opening up the section by including
everything then fitting in parameters with an exception for utility
commands (without previously/otherwise identifying them) forced some
undesirable verbosity. Instead, I opened up with the utility commands as
the main body of non-result returning commands and then moved onto
delete/insert/update non-returning cases when the subsequent paragraph
regarding parameters can then refer to the second class (by way of
excluding the first class). This seems to flow better, IMO.I have no objections, but maybe these pages are a little bit unclear
generally, because the core of the problem is not described.Personally I miss a description of the reason why variables cannot be
used - the description "variables cannot be used in statements without
result" is true, but it is not core information.In the section "executing commands that don't return results" it does seem
like core information...but I get your point.The important fact is usage of an execution plan or not.
This is already mentioned in the linked-to section:
"Variable substitution currently works only in SELECT, INSERT, UPDATE, and
DELETE commands, because the main SQL engine allows query parameters only
in these commands. To use a non-constant name or value in other statement
types (generically called utility statements), you must construct the
utility statement as a string and EXECUTE it."I didn't feel the need to repeat that material in full in the "no results"
section. I left that pointing out the "results" dynamic there would be
useful since the original wording seemed to forget about the presence of
utility commands altogether which was confusing for that section.
ok
Pavel
Show quoted text
David J.
On 11/30/20 10:37 AM, Pavel Stehule wrote:
po 30. 11. 2020 v 16:06 odesílatel David G. Johnston
ok
This patch looks reasonable to me overall.
A few comments:
1) PL/SQL seems to be used in a few places where I believe PL/pgSQL is
meant. This was pre-existing but now seems like a good opportunity to
fix it, unless I am misunderstanding.
2) I think:
+ makes the command behave like <command>SELECT</command>, which is
described
flows a little better as:
+ makes the command behave like <command>SELECT</command>, as described
Regards,
--
-David
david@pgmasters.net
út 9. 3. 2021 v 18:03 odesílatel David Steele <david@pgmasters.net> napsal:
On 11/30/20 10:37 AM, Pavel Stehule wrote:
po 30. 11. 2020 v 16:06 odesílatel David G. Johnston
ok
This patch looks reasonable to me overall.
A few comments:
1) PL/SQL seems to be used in a few places where I believe PL/pgSQL is
meant. This was pre-existing but now seems like a good opportunity to
fix it, unless I am misunderstanding.
+1
2) I think:
+ makes the command behave like <command>SELECT</command>, which is
describedflows a little better as:
+ makes the command behave like <command>SELECT</command>, as described
I am not native speaker, so I am not able to evaluate it.
Regards
Pavel
Show quoted text
Regards,
--
-David
david@pgmasters.net
David Steele <david@pgmasters.net> writes:
1) PL/SQL seems to be used in a few places where I believe PL/pgSQL is
meant. This was pre-existing but now seems like a good opportunity to
fix it, unless I am misunderstanding.
PL/SQL is Oracle's function language, which PL/pgSQL is modeled on.
At least some of the mentions of PL/SQL are probably intentional,
so you'll have to look closely not just search-and-replace.
regards, tom lane
On Tue, Mar 9, 2021 at 10:45 AM Pavel Stehule <pavel.stehule@gmail.com>
wrote:
út 9. 3. 2021 v 18:03 odesílatel David Steele <david@pgmasters.net>
napsal:On 11/30/20 10:37 AM, Pavel Stehule wrote:
po 30. 11. 2020 v 16:06 odesílatel David G. Johnston
ok
This patch looks reasonable to me overall.
A few comments:
1) PL/SQL seems to be used in a few places where I believe PL/pgSQL is
meant. This was pre-existing but now seems like a good opportunity to
fix it, unless I am misunderstanding.+1
I vaguely recall looking for this back in October and not finding anything
that needed fixing in the area I was working in. The ready-for-commit can
stand without further investigation. Feel free to look for and fix
oversights of this nature if you feel they exist.
2) I think:
+ makes the command behave like <command>SELECT</command>, which is
describedflows a little better as:
+ makes the command behave like <command>SELECT</command>, as
describedI am not native speaker, so I am not able to evaluate it.
"which is described" is perfectly valid. I don't know that "as described"
is materially better from a flow perspective (I agree it reads a tiny bit
better) but either seems to adequately communicate the intended point so I
wouldn't gripe if it was changed during commit.
I intend to leave the patch as-is though since as written it is
committable, this second comment is just style and the first is scope creep.
David J.
On 3/9/21 1:08 PM, Tom Lane wrote:
David Steele <david@pgmasters.net> writes:
1) PL/SQL seems to be used in a few places where I believe PL/pgSQL is
meant. This was pre-existing but now seems like a good opportunity to
fix it, unless I am misunderstanding.PL/SQL is Oracle's function language, which PL/pgSQL is modeled on.
At least some of the mentions of PL/SQL are probably intentional,
so you'll have to look closely not just search-and-replace.
Ah, yes. That's what I get for just reading the patch and not looking at
the larger context.
Regards,
--
-David
david@pgmasters.net
I looked over the v2 patch. Parts of it seem like improvements but
other parts definitely don't. In particular, I thought you introduced
a great deal of confusion in 43.5.2 (Executing a Command with No Result).
The statement that you can write a non-result-returning SQL command as-is
is true in general, and ought not be confused with the question of whether
you can insert variable values into it. Also, starting with a spongy
definition of "utility command" and then contrasting with that does not
seem to me to add clarity.
I attach a v3 that I like better, although there's room to disagree
about that. I've always felt that the separation between 43.5.2 and
43.5.3 was rather artificial --- it's okay I guess for describing
how to handle command output, but we end up with considerable
duplication when it comes to describing how to insert values into a
command. It's tempting to try re-splitting it to separate optimizable
from non-optimizable statements; but maybe that'd just end with
different duplication.
regards, tom lane
Attachments:
v3-doc-plpgsql-variable-usage-cleanup.patchtext/x-diff; charset=us-ascii; name=v3-doc-plpgsql-variable-usage-cleanup.patchDownload+26-20
Hi
pá 12. 3. 2021 v 21:08 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
I looked over the v2 patch. Parts of it seem like improvements but
other parts definitely don't. In particular, I thought you introduced
a great deal of confusion in 43.5.2 (Executing a Command with No Result).
The statement that you can write a non-result-returning SQL command as-is
is true in general, and ought not be confused with the question of whether
you can insert variable values into it. Also, starting with a spongy
definition of "utility command" and then contrasting with that does not
seem to me to add clarity.I attach a v3 that I like better, although there's room to disagree
about that. I've always felt that the separation between 43.5.2 and
43.5.3 was rather artificial --- it's okay I guess for describing
how to handle command output, but we end up with considerable
duplication when it comes to describing how to insert values into a
command. It's tempting to try re-splitting it to separate optimizable
from non-optimizable statements; but maybe that'd just end with
different duplication.
I am not sure if people can understand the "optimizable command" term. More
common categories are DML, DDL and SELECT. Maybe it is easier to say. DDL
statements don't support parametrizations, and then the variables cannot be
used there.
Show quoted text
regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes:
pá 12. 3. 2021 v 21:08 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
I attach a v3 that I like better, although there's room to disagree
about that.
I am not sure if people can understand the "optimizable command" term. More
common categories are DML, DDL and SELECT. Maybe it is easier to say. DDL
statements don't support parametrizations, and then the variables cannot be
used there.
Yeah, but DML/DDL is a pretty squishy separation as well, besides
which it'd mislead people for cases such as CREATE TABLE AS SELECT.
(Admittedly, I didn't mention that in my version either, but if you
think in terms of whether the optimizer will be applied then you
will draw the right conclusion.)
Maybe there's no way out but to specifically list the statement types
we can insert query parameters in.
regards, tom lane
pá 12. 3. 2021 v 21:36 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
pá 12. 3. 2021 v 21:08 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
I attach a v3 that I like better, although there's room to disagree
about that.I am not sure if people can understand the "optimizable command" term.
More
common categories are DML, DDL and SELECT. Maybe it is easier to say. DDL
statements don't support parametrizations, and then the variables cannotbe
used there.
Yeah, but DML/DDL is a pretty squishy separation as well, besides
which it'd mislead people for cases such as CREATE TABLE AS SELECT.
(Admittedly, I didn't mention that in my version either, but if you
think in terms of whether the optimizer will be applied then you
will draw the right conclusion.)
Can it be better to use word planner instead of optimizer? An optimization
is too common a word, and unfortunately a lot of people have no idea what
optimization in SQL means.
It can be pretty hard, because the people that have problems here don't
know what is a plan or what is an optimization.
Maybe there's no way out but to specifically list the statement types
we can insert query parameters in.
can be
Show quoted text
regards, tom lane
On Fri, Mar 12, 2021 at 1:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Pavel Stehule <pavel.stehule@gmail.com> writes:
pá 12. 3. 2021 v 21:08 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
I attach a v3 that I like better, although there's room to disagree
about that.I am not sure if people can understand the "optimizable command" term.
More
common categories are DML, DDL and SELECT. Maybe it is easier to say. DDL
statements don't support parametrizations, and then the variables cannotbe
used there.
Yeah, but DML/DDL is a pretty squishy separation as well, besides
which it'd mislead people for cases such as CREATE TABLE AS SELECT.
(Admittedly, I didn't mention that in my version either, but if you
think in terms of whether the optimizer will be applied then you
will draw the right conclusion.)
Related to an earlier email though, "CREATE TABLE AS SELECT" gets optimized
but "COPY (SELECT) TO" doesn't...
DML/DDL has the merit of being chapters 5 and 6 in the documentation (with
7 being SELECT).
I do agree that the delineation of "returns records or not" is not ideal
here. SELECT, then INSERT/UPDATE/DELETE (due to their shared RETURNING
dynamic), then "DML commands", then "DMS exceptions" (these last two
ideally leveraging the conceptual work noted above). That said, I do not
think this is such a big issue as to warrant that much of a rewrite. But
in lieu of that, and based upon responses given on the mailing lists,
"utility commands" seems preferable to optimizable commands. Defining,
either by name or by behavior, what utility commands are is needed though,
ideally outside of this chapter. Then a paragraph in the "no result"
section should list explicitly those utility commands that are an
exception, since they have an attached SELECT statement that does get
optimized.
Maybe in Chapter 4, with some forward references, some of this can be
covered and the exceptions to the rule (like CREATE TABLE AS) can be
mentioned.
To address your point about "utility commands", lacking an external
definition to link to, I would change it to be "everything except
INSERT/UPDATE/DELETE, which are described below, as well as EXPLAIN and
SELECT which are described in the next section". From there I like my
proposed flow into INSERT/UPDATE/DELETE w/o RETURNING, then from there the
RETURNING pointing forward to these being SELECT-like in behavior.
Adding a note about using EXECUTE works for me.
Calling EXPLAIN a utility command seems incorrect given that it behaves
just like a query. If it quacks like a duck...
What other row set returning commands are you considering as being utility?
Maybe there's no way out but to specifically list the statement types
we can insert query parameters in.
In the following I'm confused as to why "column reference" is specified
since those are not substituted:
"Parameters will only be substituted in places where a parameter or
column reference is syntactically allowed."
I'm not married to my explicit calling out of identifiers not being
substitutable but that does tend to be what people try to do.
I'm good with the Pl/SQL wording proposal.
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
I do agree that the delineation of "returns records or not" is not ideal
here. SELECT, then INSERT/UPDATE/DELETE (due to their shared RETURNING
dynamic), then "DML commands", then "DMS exceptions" (these last two
ideally leveraging the conceptual work noted above). That said, I do not
think this is such a big issue as to warrant that much of a rewrite.
I took a stab at doing that, just to see what it might look like.
I thought it comes out pretty well, really -- see what you think.
(This still uses the terminology "optimizable statement", but I'm open
to replacing that with something else.)
In the following I'm confused as to why "column reference" is specified
since those are not substituted:
"Parameters will only be substituted in places where a parameter or
column reference is syntactically allowed."
The meaning of "column reference" there is, I think, a reference to
a column of a table being read by a query. In the counterexample
of "INSERT INTO mytable (col) ...", "col" cannot be replaced by a
data value. But in "INSERT INTO mytable (col) SELECT foo FROM bar",
"foo" is a candidate for replacement, even though it's likely meant
as a reference to bar.foo.
I'm not married to my explicit calling out of identifiers not being
substitutable but that does tend to be what people try to do.
The problem I had with it was that it didn't help clarify this
distinction. I'm certainly open to changes that do clarify that.
regards, tom lane
Attachments:
v4-doc-plpgsql-variable-usage-cleanup.patchtext/x-diff; charset=us-ascii; name=v4-doc-plpgsql-variable-usage-cleanup.patchDownload+72-36
I wrote:
I took a stab at doing that, just to see what it might look like.
I thought it comes out pretty well, really -- see what you think.
Hearing nothing further, I pushed that after another round of
copy-editing. There's still plenty of time to revise it if
anybody has further comments.
regards, tom lane