merging some features from plpgsql2 project
Hi
I reread ideas described on page https://github.com/trustly/plpgsql2
Some points are well and can be benefit for PlpgSQL.
First I describe my initial position. I am strongly against introduction
"new" language - plpgsql2 or new plpgsql, or any else. The trust of
developers to us is important and introduction of any not compatible or
different feature has to have really big reason. PostgreSQL is conservative
environment, and PLpgSQL should not be a exception. More - I have not any
information from my customers, colleagues about missing features in this
language. If there is some gaps, then it is in outer environment - IDE,
deployment, testing,
I understand so we have to do some break compatibility changes, but the
changes should be smoothly with possibility to simply identify necessary
changes, but better don't do it - and use other possibility.
i lost hope so plpgsql_check can be integrated to core. It is living
outside well - only preparing dll for MSWindows is annoyance. But some
checks from plpgsql_check can be implemented in core as extra_checks, and
some checks from plpgsql2 can be implemented in plpgsql_check.
Points from plpgsql2:
* SELECT .. INTO vs. TOO_MANY_ROWS - can be implemented as extra check
* SELECT .. INTO and the number of result columns - good extra check too
* EXECUTE and FOUND - this is incompatible change, extra check can be used
(test on unset variable). I see solution in leaving FOUND variable and
introduction of some new without this issue - ROW_COUNT maybe (this is
another possible incompatible change, but with higher benefit - maybe we
can introduce some aliasing, PRAGMA clause, default PRAGMAs, ..).
* SELECT .. INTO and := - looks bizarre, but I see clean benefit and I can
accept it
* The OUT namespace and OUT parameter visibility - I don't like it - not in
this form - can we introduce some form of namespace aliasing? The arguments
are in function name named namespace already.
Now, we doesn't use plpgsql_extra_check much and it is pity.
I thing so real question is support some macros, that can help with code
maintenance for different PostgreSQL versions. PostGIS code is nice example
what we are missing.
Regards
Pavel
2016-12-27 8:54 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
Hi
I reread ideas described on page https://github.com/trustly/plpgsql2
Some points are well and can be benefit for PlpgSQL.
First I describe my initial position. I am strongly against introduction
"new" language - plpgsql2 or new plpgsql, or any else. The trust of
developers to us is important and introduction of any not compatible or
different feature has to have really big reason. PostgreSQL is conservative
environment, and PLpgSQL should not be a exception. More - I have not any
information from my customers, colleagues about missing features in this
language. If there is some gaps, then it is in outer environment - IDE,
deployment, testing,
I forgot - the big plpgsql issue are too weak expressions on left part of
assignment statements.
Show quoted text
Pavel
On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi
I reread ideas described on page https://github.com/trustly/plpgsql2
Some points are well and can be benefit for PlpgSQL.
First I describe my initial position. I am strongly against introduction
"new" language - plpgsql2 or new plpgsql, or any else. The trust of
developers to us is important and introduction of any not compatible or
different feature has to have really big reason. PostgreSQL is conservative
environment, and PLpgSQL should not be a exception. More - I have not any
information from my customers, colleagues about missing features in this
language. If there is some gaps, then it is in outer environment - IDE,
deployment, testing,
Breaking language compatibility is a really big deal. There has to be
a lot of benefits to the effort and you have to make translation from
plpgsql1 to plpgsql2 really simple. You have made some good points on
the rationale but not nearly enough to justify implementation fork. So
basically I agree. Having said that, If you don't mind I'd like to
run with the topic (which I'm loosely interpreting as, "Things I'd
like to do in SQL/PLPGSQL and can't").
#1 problem with plpgsql in my point of view is that the language and
grammar are not supersets of sql. A lot of PLPGSQL keywords (EXECUTE,
BEGIN, INTO, END) have incompatible meanings with our SQL
implementation. IMNSHO, SQL ought to give the same behavior inside or
outside of plpgsql. It doesn't, and this is one of the reasons why
plpgsql may not be a good candidate for stored procedure
implementation.
#2 problem with plpgsql is after function entry it's too late to do
things like set transaction isolation level and change certain kinds
of variables (like statement_timeout). This is very obnoxious, I
can't wrap the database in an API 100%; the application has to manage
things that really should be controlled in SQL.
#3 problem with plpgsql is complete lack of inlining. inlining
function calls in postgres is a black art even for very trivial cases.
This makes it hard for us to write quick things and in the worst case
causes endless duplications of simple expressions.
In short I guess the issue is that we don't have stored procedures and
I don't see an easy path to getting there with the current language.
There are a lot of other little annoyances but most of them can be
solved without a compatibility break.
It would be pretty neat if postgres SQL implementation could directly
incorporate limited flow control and command execution. For example,
CREATE my_proc(Done OUT BOOL) RETURNS BOOL AS
$$
BEGIN;
SET transaction_isolation = 'serializable';
SELECT some_plpgsql_func_returning_bool();
COMMIT;
$$;
CALL my_proc() UNTIL Done;
Key points here are:
*) my_proc is in native SQL (not plpgsql), and run outside of snapshot
*) CALL is invocation into stored procedure. I extended it in similar
fashion as pl/sql CALL
(https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_4008.htm)
but anything will do for syntaxs as long as you get arbitrary control
of procedure lifetime external to snapshot and transaction
*) simple addition of UNTIL gets us out of the debate for best 'stored
procedure language'. Keeping things to pure SQL really simplifies
things since we already have statement parsing at tcop level. We just
need some special handling for CALL.
*) In my usage of plpgsql maybe 80% of database cases are covered
purely in language but maybe 20% of cases need support from
application typically where threading and transaction management is
involved. With the above it would be more like 95% would be covered
and if you extended CALL to something like:
CALL my_proc() IN BACKGROUND UNTIL Done;
..where "IN BACKGOUND" moved execution to a background worker one
could do just about everything in SQL in tasks that do nothing but
read and write to the database that today need significant support
from outside language (primarily bash for me).
With respect to stuff you mentioned, like smarter handling of INTO,
are you really sure you need to break compatibility for that?
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2016-12-27 23:56 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:Hi
I reread ideas described on page https://github.com/trustly/plpgsql2
Some points are well and can be benefit for PlpgSQL.
First I describe my initial position. I am strongly against introduction
"new" language - plpgsql2 or new plpgsql, or any else. The trust of
developers to us is important and introduction of any not compatible or
different feature has to have really big reason. PostgreSQL isconservative
environment, and PLpgSQL should not be a exception. More - I have not any
information from my customers, colleagues about missing features in this
language. If there is some gaps, then it is in outer environment - IDE,
deployment, testing,Breaking language compatibility is a really big deal. There has to be
a lot of benefits to the effort and you have to make translation from
plpgsql1 to plpgsql2 really simple. You have made some good points on
the rationale but not nearly enough to justify implementation fork. So
basically I agree. Having said that, If you don't mind I'd like to
run with the topic (which I'm loosely interpreting as, "Things I'd
like to do in SQL/PLPGSQL and can't").#1 problem with plpgsql in my point of view is that the language and
grammar are not supersets of sql. A lot of PLPGSQL keywords (EXECUTE,
BEGIN, INTO, END) have incompatible meanings with our SQL
implementation. IMNSHO, SQL ought to give the same behavior inside or
outside of plpgsql. It doesn't, and this is one of the reasons why
plpgsql may not be a good candidate for stored procedure
implementation.
There is little bit cleaner language for this purpose - SQL/PSM. But it is
hard to switch main language without big lost of reputation. I am not sure
about benefit.
#2 problem with plpgsql is after function entry it's too late to do
things like set transaction isolation level and change certain kinds
of variables (like statement_timeout). This is very obnoxious, I
can't wrap the database in an API 100%; the application has to manage
things that really should be controlled in SQL.
It is long story about implementation procedures - it is not related to
PLpgSQL - the language is not a issue.
#3 problem with plpgsql is complete lack of inlining. inlining
function calls in postgres is a black art even for very trivial cases.
This makes it hard for us to write quick things and in the worst case
causes endless duplications of simple expressions.
In short I guess the issue is that we don't have stored procedures and
I don't see an easy path to getting there with the current language.
There are a lot of other little annoyances but most of them can be
solved without a compatibility break.
I don't think so implementation of procedures will be simple, but I don't
see any issue in PLpgSQL.
It would be pretty neat if postgres SQL implementation could directly
incorporate limited flow control and command execution. For example,
CREATE my_proc(Done OUT BOOL) RETURNS BOOL AS
$$
BEGIN;
SET transaction_isolation = 'serializable';
SELECT some_plpgsql_func_returning_bool();
COMMIT;
$$;
CALL my_proc() UNTIL Done;Key points here are:
*) my_proc is in native SQL (not plpgsql), and run outside of snapshot
*) CALL is invocation into stored procedure. I extended it in similar
fashion as pl/sql CALL
(https://docs.oracle.com/cd/B19306_01/server.102/b14200/
statements_4008.htm)
but anything will do for syntaxs as long as you get arbitrary control
of procedure lifetime external to snapshot and transaction
*) simple addition of UNTIL gets us out of the debate for best 'stored
procedure language'. Keeping things to pure SQL really simplifies
things since we already have statement parsing at tcop level. We just
need some special handling for CALL.
*) In my usage of plpgsql maybe 80% of database cases are covered
purely in language but maybe 20% of cases need support from
application typically where threading and transaction management is
involved. With the above it would be more like 95% would be covered
and if you extended CALL to something like:
It is similar to my older proposals of stored procedures.
CALL my_proc() IN BACKGROUND UNTIL Done;
..where "IN BACKGOUND" moved execution to a background worker one
could do just about everything in SQL in tasks that do nothing but
read and write to the database that today need significant support
from outside language (primarily bash for me).With respect to stuff you mentioned, like smarter handling of INTO,
are you really sure you need to break compatibility for that?
I didn't propose any compatibility break.
Can we talk about another proposals separately, please. Stored procedures,
batch processing, different language are different topic.
Regards
Pavel
Show quoted text
merlin
On 12/27/16 4:56 PM, Merlin Moncure wrote:
On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
First I describe my initial position. I am strongly against introduction
"new" language - plpgsql2 or new plpgsql, or any else. The trust of
developers to us is important and introduction of any not compatible or
different feature has to have really big reason. PostgreSQL is conservative
environment, and PLpgSQL should not be a exception. More - I have not any
Which is why this is an external fork of plpgsql.
** The real problem is that we have no mechanism for allowing a PL's
language/syntax/API to move forward without massive backwards
compatibility problems. **
This is NOT unique to plpgsql. plpython (for one) definitely has some
stupidity that will require an API break to fix.
A secondary issue is the lack of a blessed collection of extensions. If
we had that we could maintain some of this stuff outside of the core
release schedule, as well as provide more room for people to run
experimental versions of extensions if they desired. If we had this then
perhaps plpgsql_check would become a viable answer to some of this
(though IMHO plpgsql_check is just a work-around for our lack of dealing
with API compatibility).
information from my customers, colleagues about missing features in this
language. If there is some gaps, then it is in outer environment - IDE,
deployment, testing,
I'm honestly surprised (even shocked) that you've never run into any of
the problems plpgsql2 is trying to solve. I've hit all those problems
except for OUT parameters. I'd say the order they're listed in actually
corresponds to how often I hit the problems.
Breaking language compatibility is a really big deal. There has to be
a lot of benefits to the effort and you have to make translation from
plpgsql1 to plpgsql2 really simple. You have made some good points on
I think trying to move the ball forward in a meaningful way without
breaking compatibility is a lost cause. Some of these issues could be
addressed by adding more syntax, but even that has limits (do we really
want another variation of STRICT that allows only 0 or 1 rows?). And
there's no way to fix your #1 item below without breaking compatibility.
There *are* other ways this could be done, besides creating a different
PL. One immediate possibility is custom GUCs; there may be other options.
#1 problem with plpgsql in my point of view is that the language and
grammar are not supersets of sql. A lot of PLPGSQL keywords (EXECUTE,
BEGIN, INTO, END) have incompatible meanings with our SQL
implementation. IMNSHO, SQL ought to give the same behavior inside or
outside of plpgsql. It doesn't, and this is one of the reasons why
plpgsql may not be a good candidate for stored procedure
implementation.
While this doesn't bug me, it's got to be confusing as hell for newbies.
#2 problem with plpgsql is after function entry it's too late to do
things like set transaction isolation level and change certain kinds
of variables (like statement_timeout). This is very obnoxious, I
can't wrap the database in an API 100%; the application has to manage
things that really should be controlled in SQL.
+1
#3 problem with plpgsql is complete lack of inlining. inlining
function calls in postgres is a black art even for very trivial cases.
This makes it hard for us to write quick things and in the worst case
causes endless duplications of simple expressions.
Instead of banging our heads against the fmgr API to try and solve this,
I suspect it would be much simpler (and easier to understand) if we had
the equivalent to a #define for queries. The fmgr API just isn't
amenable to trying to inline stuff. This would allow you to define
things like views that accept arguments, so you can shove the argument
way down in the guts of the query without getting tripped up by fences.
Here's some other plpgsql pain-points (though, not all of these require
an API break):
#4: it's impossible to operate on a Datum in-place. Though, maybe the
work Tom did with ExpandedObjects eliminates some of this problem, but
if it does it's hidden behind the existing syntax and you have no way to
know it (and AFAICT the only thing using that infrastructure right now
is arrays). Aside from the performance aspects, it'd be damn nice to be
able to do things like ++, +=, etc.
#5: handling of parameter name collisions still sucks. One way to
improve this would be to put parameters inside the outer-most statement
block, so you could use a block name that was different from the
function name. Something else that might help is the ability to assign a
namespace for query identifiers, so you don't have to alias every
individual relation in a query.
#6: The variations of syntax between the FOR variants is annoying
(specifically, FOREACH necessitating the ARRAY keyword).
#7: = vs := vs INTO. = can do everything the others can do except for
STRICT, and when it comes to STRICT I actually wish we had language
support for whether 0 or >1 rows are allowed. I've wanted that in the
past for views, and if we had that then you'd be able to use it in SQL
functions as well. If that's not possible then we should fid some other
way to handle this in plpgsql, because STRICT is often too broad.
#8: EVERYTHING command option should accept a variable. In particular,
RAISE should accept a variable for level, but there's other cases of
this I've run into. I'd also be nice if you could plop variables into
SQL commands where you'd have an identifier, though presumably that
would require some kind of explicit variable identifier.
#9: You should be able to modify an exception before re-raising it.
#10: Exception information should be passed around as a composite.
#11: Composite support is very lacking. There's no easy way to get a
list of what fields exist in a composite, let alone do something generic
to some set of them. There are ways to work around this, but they're
very tedious and ugly.
#12: It'd be nice if any was allowed, as there are operations that can
apply to more than one class of data type.
#13: cstring support would allow a lot more people to experiment with
things like custom types. Yes, plpgsql might be slow as hell for this,
but sometimes that doesn't matter. Even if it does, it can be a lot
easier to prototype in something other than C. (Granted, I think there's
some non-plpgsql stuff that would need to happen to allow this.)
In short I guess the issue is that we don't have stored procedures and
I don't see an easy path to getting there with the current language.
There are a lot of other little annoyances but most of them can be
solved without a compatibility break.
Well, actual stored *procedures* is an entirely different problem, which
(again) fmgr is absolutely not designed to handle. All the PL handlers
that I've looked at have completely in-grained the notion that they're
running inside a transaction, so it would be a lot of work to try and
change that. While there may be some plpgsql-specific problems with it
supporting stored procs, there are much bigger questions to answer
before worrying about that.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2016-12-28 5:09 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 12/27/16 4:56 PM, Merlin Moncure wrote:
On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:First I describe my initial position. I am strongly against introduction
"new" language - plpgsql2 or new plpgsql, or any else. The trust of
developers to us is important and introduction of any not compatible or
different feature has to have really big reason. PostgreSQL is
conservative
environment, and PLpgSQL should not be a exception. More - I have not anyWhich is why this is an external fork of plpgsql.
ok. Just I would not to repeat Perl6 or Python3 story - it is big
adventure, but big fail too
** The real problem is that we have no mechanism for allowing a PL's
language/syntax/API to move forward without massive backwards compatibility
problems. **
We have not, but there are few possibilities:
1. enhance #option command
2. we can introduce PRAGMA command
https://en.wikipedia.org/wiki/Ada_(programming_language)#Pragmas
This is NOT unique to plpgsql. plpython (for one) definitely has some
stupidity that will require an API break to fix.A secondary issue is the lack of a blessed collection of extensions. If we
had that we could maintain some of this stuff outside of the core release
schedule, as well as provide more room for people to run experimental
versions of extensions if they desired. If we had this then perhaps
plpgsql_check would become a viable answer to some of this (though IMHO
plpgsql_check is just a work-around for our lack of dealing with API
compatibility).
plpgsql_check can do some test, that are impossible in plpgsql - from
performance view, from features. But some "blessed collections of
extension" can be nice. More if will be joined with some automatic test and
build tools. Although lot of extensions are really mature, the knowleadge
about these extensions are minimal - and building extensions on windows is
hard work still (for Linux developer).
information from my customers, colleagues about missing features in this
language. If there is some gaps, then it is in outer environment - IDE,
deployment, testing,I'm honestly surprised (even shocked) that you've never run into any of
the problems plpgsql2 is trying to solve. I've hit all those problems
except for OUT parameters. I'd say the order they're listed in actually
corresponds to how often I hit the problems.
I hit lot of older harder (now solved) issues - now, with more experience I
am able to see these issues. And I wrote plpgsql_check, partially for self
too. Years ago I prefer safe expressions.
Breaking language compatibility is a really big deal. There has to be
a lot of benefits to the effort and you have to make translation from
plpgsql1 to plpgsql2 really simple. You have made some good points onI think trying to move the ball forward in a meaningful way without
breaking compatibility is a lost cause. Some of these issues could be
addressed by adding more syntax, but even that has limits (do we really
want another variation of STRICT that allows only 0 or 1 rows?). And
there's no way to fix your #1 item below without breaking compatibility.
I think so there is way with extra check, or with persistent plpgsql
options - just use it, please. Some checks are clear, some other not.
There *are* other ways this could be done, besides creating a different
PL. One immediate possibility is custom GUCs; there may be other options.#1 problem with plpgsql in my point of view is that the language and
grammar are not supersets of sql. A lot of PLPGSQL keywords (EXECUTE,
BEGIN, INTO, END) have incompatible meanings with our SQL
implementation. IMNSHO, SQL ought to give the same behavior inside or
outside of plpgsql. It doesn't, and this is one of the reasons why
plpgsql may not be a good candidate for stored procedure
implementation.While this doesn't bug me, it's got to be confusing as hell for newbies.
If you know ALGOL family languages, then it is not problem. What is a
harder problem for people is different implementation of mix SQL and PL -
different than Oracle, or MSSQL. Our model is better, simpler but
different. It is difficult for people without knowleadge of differences
between functions and procedures. Partially we badly speaking so our void
functions are procedures.
#2 problem with plpgsql is after function entry it's too late to do
things like set transaction isolation level and change certain kinds
of variables (like statement_timeout). This is very obnoxious, I
can't wrap the database in an API 100%; the application has to manage
things that really should be controlled in SQL.+1
#3 problem with plpgsql is complete lack of inlining. inlining
function calls in postgres is a black art even for very trivial cases.
This makes it hard for us to write quick things and in the worst case
causes endless duplications of simple expressions.Instead of banging our heads against the fmgr API to try and solve this, I
suspect it would be much simpler (and easier to understand) if we had the
equivalent to a #define for queries. The fmgr API just isn't amenable to
trying to inline stuff. This would allow you to define things like views
that accept arguments, so you can shove the argument way down in the guts
of the query without getting tripped up by fences.Here's some other plpgsql pain-points (though, not all of these require an
API break):#4: it's impossible to operate on a Datum in-place. Though, maybe the work
Tom did with ExpandedObjects eliminates some of this problem, but if it
does it's hidden behind the existing syntax and you have no way to know it
(and AFAICT the only thing using that infrastructure right now is arrays).
Aside from the performance aspects, it'd be damn nice to be able to do
things like ++, +=, etc.#5: handling of parameter name collisions still sucks. One way to improve
this would be to put parameters inside the outer-most statement block, so
you could use a block name that was different from the function name.
Something else that might help is the ability to assign a namespace for
query identifiers, so you don't have to alias every individual relation in
a query.#6: The variations of syntax between the FOR variants is annoying
(specifically, FOREACH necessitating the ARRAY keyword).
this is design - FOR is old PL/SQL syntax. FOREACH is prepared for
extending
#7: = vs := vs INTO. = can do everything the others can do except for
STRICT, and when it comes to STRICT I actually wish we had language support
for whether 0 or >1 rows are allowed. I've wanted that in the past for
views, and if we had that then you'd be able to use it in SQL functions as
well. If that's not possible then we should fid some other way to handle
this in plpgsql, because STRICT is often too broad.#8: EVERYTHING command option should accept a variable. In particular,
RAISE should accept a variable for level, but there's other cases of this
I've run into. I'd also be nice if you could plop variables into SQL
commands where you'd have an identifier, though presumably that would
require some kind of explicit variable identifier.
It is hiding dynamic SQL - I am strongly against it - minimally due
performance issues. Important functionality should not be hidden.
#9: You should be able to modify an exception before re-raising it.
#10: Exception information should be passed around as a composite
I have not any problem with last two points
.
#11: Composite support is very lacking. There's no easy way to get a list
of what fields exist in a composite, let alone do something generic to some
set of them. There are ways to work around this, but they're very tedious
and ugly.
+1
#12: It'd be nice if any was allowed, as there are operations that can
apply to more than one class of data type.#13: cstring support would allow a lot more people to experiment with
things like custom types. Yes, plpgsql might be slow as hell for this, but
sometimes that doesn't matter. Even if it does, it can be a lot easier to
prototype in something other than C. (Granted, I think there's some
non-plpgsql stuff that would need to happen to allow this.)
Not sure about it (I have really realy wrong experience with some
developers about performance) - but PLPython, PLPerl can do it well, and I
miss some possibility - We can use transformations more time - SQL/MM is
based on new datatypes and transformations.
In short I guess the issue is that we don't have stored procedures and
I don't see an easy path to getting there with the current language.
There are a lot of other little annoyances but most of them can be
solved without a compatibility break.Well, actual stored *procedures* is an entirely different problem, which
(again) fmgr is absolutely not designed to handle. All the PL handlers that
I've looked at have completely in-grained the notion that they're running
inside a transaction, so it would be a lot of work to try and change that.
While there may be some plpgsql-specific problems with it supporting stored
procs, there are much bigger questions to answer before worrying about that.
yes. The design of transaction controlling inside stored procedures is hard
work not related to any PL. Some can be partially solved by functions
executed in autonomous transactions. With background workers we can
implement asynchronous autonomous transactions - what can coverage lot of
use cases where transaction controlling should be used in other databases.
Regards
Pavel
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On 12/28/16 7:16 AM, Pavel Stehule wrote:
** The real problem is that we have no mechanism for allowing a PL's
language/syntax/API to move forward without massive backwards
compatibility problems. **We have not, but there are few possibilities:
1. enhance #option command
2. we can introduce PRAGMA command
https://en.wikipedia.org/wiki/Ada_(programming_language)#Pragmas
<https://en.wikipedia.org/wiki/Ada_%28programming_language%29#Pragmas>
I wanted to break this out separately, because IMO it's the real heart
of the matter.
I think it would be silly not to allow a global setting of
compatibility. You certainly don't want to force people to stick magic
keywords in their code forevermore.
To that end, would GUCs be a workable answer here? That should give you
the ability to control incompatibilities at a function, user, database
and global level. It would also allow you to chose between raising a
WARNING vs a FATAL.
I realize we've had some bad experiences with compatibility GUCs in the
past, but I'd argue we've also had some good experiences. I see that
add_missing_from is now completely gone, for example, presumably with no
complaints. There's probably several other compatibility GUCs we could
remove now.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 12/28/16 7:16 AM, Pavel Stehule wrote:
2016-12-28 5:09 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com
<mailto:Jim.Nasby@bluetreble.com>>:On 12/27/16 4:56 PM, Merlin Moncure wrote:
On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule
<pavel.stehule@gmail.com <mailto:pavel.stehule@gmail.com>> wrote:
Which is why this is an external fork of plpgsql.ok. Just I would not to repeat Perl6 or Python3 story - it is big
adventure, but big fail too
Yeah, creating an entirely "new" PL to deal with compatibility doesn't
seem like a good idea to me.
** The real problem is that we have no mechanism for allowing a PL's
language/syntax/API to move forward without massive backwards
compatibility problems. **We have not, but there are few possibilities:
1. enhance #option command
2. we can introduce PRAGMA command
https://en.wikipedia.org/wiki/Ada_(programming_language)#Pragmas
See separate reply.
<discussion about "blessed extensions">
I'm honestly surprised (even shocked) that you've never run into any
of the problems plpgsql2 is trying to solve. I've hit all those
problems except for OUT parameters. I'd say the order they're listed
in actually corresponds to how often I hit the problems.I hit lot of older harder (now solved) issues - now, with more
experience I am able to see these issues. And I wrote plpgsql_check,
partially for self too. Years ago I prefer safe expressions.
Recognizing a problem ahead of time (or having plpgsql_check do it for
you) still means you have to find a way to work around it. In some cases
(ie: STRICT), that workaround can be a serious PITA. Better to just
eliminate the problem itself.
I think trying to move the ball forward in a meaningful way without
breaking compatibility is a lost cause. Some of these issues could
be addressed by adding more syntax, but even that has limits (do we
really want another variation of STRICT that allows only 0 or 1
rows?). And there's no way to fix your #1 item below without
breaking compatibility.I think so there is way with extra check, or with persistent plpgsql
options - just use it, please. Some checks are clear, some other not.
I will assert that there will ALWAYS be problems that you can't plaster
over with some kind of extra checking (like plpgsql_check). At some
point, in order to fix those, you have to somehow break compatibility.
Look at libpq as an example. There's a reason we're on protocol V3.
If you know ALGOL family languages, then it is not problem. What is a
Lets be realistic... what % of our users have even heard of ALGOL, let
alone used it? :)
harder problem for people is different implementation of mix SQL and PL
- different than Oracle, or MSSQL. Our model is better, simpler but
different. It is difficult for people without knowleadge of differences
between functions and procedures. Partially we badly speaking so our
void functions are procedures.
I suspect that's only confusing for people coming from Oracle (which of
course is a non-trivial number of people).
#6: The variations of syntax between the FOR variants is annoying
(specifically, FOREACH necessitating the ARRAY keyword).this is design - FOR is old PL/SQL syntax. FOREACH is prepared for
extending
Understood. It still sucks though. :)
#8: EVERYTHING command option should accept a variable. In
particular, RAISE should accept a variable for level, but there's
other cases of this I've run into. I'd also be nice if you could
plop variables into SQL commands where you'd have an identifier,
though presumably that would require some kind of explicit variable
identifier.It is hiding dynamic SQL - I am strongly against it - minimally due
performance issues. Important functionality should not be hidden.
There's definitely ways around the performance issue. I do agree that it
needs to be clear when you're doing something dynamic so it's not
accidental. One way to do that would be to add support for variable
decorators and mandate the use of decorators when using a variable for
an identifier.
That said, *every* option to RAISE can be a variable except the level.
That's just plain silly and should be fixed.
#13: cstring support would allow a lot more people to experiment
with things like custom types. Yes, plpgsql might be slow as hell
for this, but sometimes that doesn't matter. Even if it does, it can
be a lot easier to prototype in something other than C. (Granted, I
think there's some non-plpgsql stuff that would need to happen to
allow this.)Not sure about it (I have really realy wrong experience with some
developers about performance) - but PLPython, PLPerl can do it well, and
I miss some possibility - We can use transformations more time - SQL/MM
is based on new datatypes and transformations.
Well, there's probably some other things that could be done to make
plpgsql perform better in this regard. One thing I've wondered about is
allowing array-like access to a plain string (or in this case, cstring).
That would allow you to write code that would translate much better into
fast C code. IE: if you needed to scan through an entire string you
could do something like for (i=0; i<strlen(); i++).
yes. The design of transaction controlling inside stored procedures is
hard work not related to any PL. Some can be partially solved by
functions executed in autonomous transactions. With background workers
we can implement asynchronous autonomous transactions - what can
coverage lot of use cases where transaction controlling should be used
in other databases.
Well, those are all really hacks around a fundamental problem of
allowing user-defined, pre-stored code to execute outside of a
transaction. I don't think solving that is necessarily *that* hard
(though certainly non-trivial), but the fmgr interface is certainly NOT
the right way to go about that. In any case, that's a completely
different discussion.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2016-12-28 18:54 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 12/28/16 7:16 AM, Pavel Stehule wrote:
** The real problem is that we have no mechanism for allowing a PL's
language/syntax/API to move forward without massive backwards
compatibility problems. **We have not, but there are few possibilities:
1. enhance #option command
2. we can introduce PRAGMA command
https://en.wikipedia.org/wiki/Ada_(programming_language)#Pragmas
<https://en.wikipedia.org/wiki/Ada_%28programming_language%29#Pragmas>I wanted to break this out separately, because IMO it's the real heart of
the matter.I think it would be silly not to allow a global setting of compatibility.
You certainly don't want to force people to stick magic keywords in their
code forevermore.To that end, would GUCs be a workable answer here? That should give you
the ability to control incompatibilities at a function, user, database and
global level. It would also allow you to chose between raising a WARNING vs
a FATAL.
GUC are fragile - the source code and settings can be separated.
Our #option is more robust, because source code holds all flags required
for execution. So I would to see a mechanism, that will be strongly joined
with code.
Using function assigned GUC is similar, but it is looking less robust - and
some editors can forgot this information.
Lot of issues we can solved by plpgsq.extra_error, extra_warnings - but
probably not all - for example issue of FOUND variable or introducing new
auto variable ROW_COUNT. PLpgSQL - PL/SQL is safe - it propose the
statement GET DIAGNOSTICS, but I understand so isn't funny to write more
and more GET DIAGNOSTICS rc = ROW_COUNT; So some shortcuts can be nice, but
there is risk, so this shortcut breaks existing code, and the
costs/benefits are individual. There cannot be 100% agreement ever. So some
customisation should be good.
Show quoted text
I realize we've had some bad experiences with compatibility GUCs in the
past, but I'd argue we've also had some good experiences. I see that
add_missing_from is now completely gone, for example, presumably with no
complaints. There's probably several other compatibility GUCs we could
remove now.--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On 12/28/16 12:15 PM, Pavel Stehule wrote:
GUC are fragile - the source code and settings can be separated.
*Can* be, but they don't *have* to be. That's a huge feature, not a bug.
Our #option is more robust, because source code holds all flags required
for execution. So I would to see a mechanism, that will be strongly
joined with code.
That means you must ALWAYS specify, which is an enormous pain. It
basically guarantees that users will NEVER switch to the new syntax.
Using function assigned GUC is similar, but it is looking less robust -
and some editors can forgot this information.
If you forget then you get an error. Then you remember.
Lot of issues we can solved by plpgsq.extra_error, extra_warnings - but
probably not all - for example issue of FOUND variable or introducing
new auto variable ROW_COUNT. PLpgSQL - PL/SQL is safe - it propose the
statement GET DIAGNOSTICS, but I understand so isn't funny to write more
and more GET DIAGNOSTICS rc = ROW_COUNT; So some shortcuts can be nice,
but there is risk, so this shortcut breaks existing code, and the
costs/benefits are individual. There cannot be 100% agreement ever. So
some customisation should be good.
That's the whole point of having settings to deal with
incompatibilities: so we can actually fix these warts without breaking
everyone's code, yet also make it clear to users that they should stop
using the warts and instead use the new and improved syntax.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2016-12-28 19:23 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 12/28/16 12:15 PM, Pavel Stehule wrote:
GUC are fragile - the source code and settings can be separated.
*Can* be, but they don't *have* to be. That's a huge feature, not a bug.
Our #option is more robust, because source code holds all flags required
for execution. So I would to see a mechanism, that will be strongly
joined with code.That means you must ALWAYS specify, which is an enormous pain. It
basically guarantees that users will NEVER switch to the new syntax.Using function assigned GUC is similar, but it is looking less robust -
and some editors can forgot this information.
If you forget then you get an error. Then you remember.
Lot of issues we can solved by plpgsq.extra_error, extra_warnings - but
probably not all - for example issue of FOUND variable or introducing
new auto variable ROW_COUNT. PLpgSQL - PL/SQL is safe - it propose the
statement GET DIAGNOSTICS, but I understand so isn't funny to write more
and more GET DIAGNOSTICS rc = ROW_COUNT; So some shortcuts can be nice,
but there is risk, so this shortcut breaks existing code, and the
costs/benefits are individual. There cannot be 100% agreement ever. So
some customisation should be good.That's the whole point of having settings to deal with incompatibilities:
so we can actually fix these warts without breaking everyone's code, yet
also make it clear to users that they should stop using the warts and
instead use the new and improved syntax.
Now, the incompatibility can be hard issue - it is big question if we lock
some users on old versions because some users can save to lines of code.
Introduction of ROW_COUNT is lowly incompatibility - it can be simply
detected - but for example change of behave of FOUND variable is terrible,
because the code will be quietly calculate differently. sometimes we can
break code - probably people will not be happy, but sometimes we can change
the results - it can be big fail. So on one side is big costs. On second
side is few lines less code.
Regards
Pavel
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
2016-12-28 18:54 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 12/28/16 7:16 AM, Pavel Stehule wrote:
2016-12-28 5:09 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com
<mailto:Jim.Nasby@bluetreble.com>>:On 12/27/16 4:56 PM, Merlin Moncure wrote:
On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule
<pavel.stehule@gmail.com <mailto:pavel.stehule@gmail.com>> wrote:
Which is why this is an external fork of plpgsql.ok. Just I would not to repeat Perl6 or Python3 story - it is big
adventure, but big fail tooYeah, creating an entirely "new" PL to deal with compatibility doesn't
seem like a good idea to me.** The real problem is that we have no mechanism for allowing a PL's
language/syntax/API to move forward without massive backwards
compatibility problems. **We have not, but there are few possibilities:
1. enhance #option command
2. we can introduce PRAGMA command
https://en.wikipedia.org/wiki/Ada_(programming_language)#PragmasSee separate reply.
<discussion about "blessed extensions">
I'm honestly surprised (even shocked) that you've never run into any
of the problems plpgsql2 is trying to solve. I've hit all those
problems except for OUT parameters. I'd say the order they're listed
in actually corresponds to how often I hit the problems.I hit lot of older harder (now solved) issues - now, with more
experience I am able to see these issues. And I wrote plpgsql_check,
partially for self too. Years ago I prefer safe expressions.Recognizing a problem ahead of time (or having plpgsql_check do it for
you) still means you have to find a way to work around it. In some cases
(ie: STRICT), that workaround can be a serious PITA. Better to just
eliminate the problem itself.I think trying to move the ball forward in a meaningful way without
breaking compatibility is a lost cause. Some of these issues could
be addressed by adding more syntax, but even that has limits (do we
really want another variation of STRICT that allows only 0 or 1
rows?). And there's no way to fix your #1 item below without
breaking compatibility.I think so there is way with extra check, or with persistent plpgsql
options - just use it, please. Some checks are clear, some other not.I will assert that there will ALWAYS be problems that you can't plaster
over with some kind of extra checking (like plpgsql_check). At some point,
in order to fix those, you have to somehow break compatibility.
Look at libpq as an example. There's a reason we're on protocol V3.
If you know ALGOL family languages, then it is not problem. What is a
Lets be realistic... what % of our users have even heard of ALGOL, let
alone used it? :)
not too much - but the problem is not in BEGIN, END. I wrote PL/PSM where
BEGIN END doesn't exists. The functionality was same as PLpgSQL - and there
was not anybody who use it.
harder problem for people is different implementation of mix SQL and PL
- different than Oracle, or MSSQL. Our model is better, simpler but
different. It is difficult for people without knowleadge of differences
between functions and procedures. Partially we badly speaking so our
void functions are procedures.I suspect that's only confusing for people coming from Oracle (which of
course is a non-trivial number of people).#6: The variations of syntax between the FOR variants is annoying
(specifically, FOREACH necessitating the ARRAY keyword).
this is design - FOR is old PL/SQL syntax. FOREACH is prepared for
extendingUnderstood. It still sucks though. :)
#8: EVERYTHING command option should accept a variable. In
particular, RAISE should accept a variable for level, but there's
other cases of this I've run into. I'd also be nice if you could
plop variables into SQL commands where you'd have an identifier,
though presumably that would require some kind of explicit variable
identifier.It is hiding dynamic SQL - I am strongly against it - minimally due
performance issues. Important functionality should not be hidden.There's definitely ways around the performance issue. I do agree that it
needs to be clear when you're doing something dynamic so it's not
accidental. One way to do that would be to add support for variable
decorators and mandate the use of decorators when using a variable for an
identifier.That said, *every* option to RAISE can be a variable except the level.
That's just plain silly and should be fixed.
I am sorry - I read it wrong - If there is not a parser issue, then it can
be fixed simply.
#13: cstring support would allow a lot more people to experiment
with things like custom types. Yes, plpgsql might be slow as hell
for this, but sometimes that doesn't matter. Even if it does, it can
be a lot easier to prototype in something other than C. (Granted, I
think there's some non-plpgsql stuff that would need to happen to
allow this.)Not sure about it (I have really realy wrong experience with some
developers about performance) - but PLPython, PLPerl can do it well, and
I miss some possibility - We can use transformations more time - SQL/MM
is based on new datatypes and transformations.Well, there's probably some other things that could be done to make
plpgsql perform better in this regard. One thing I've wondered about is
allowing array-like access to a plain string (or in this case, cstring).
That would allow you to write code that would translate much better into
fast C code. IE: if you needed to scan through an entire string you could
do something like for (i=0; i<strlen(); i++).
It is fixable - and faster string operations can be nice feature - Just I
am not volunteer for this task :)
yes. The design of transaction controlling inside stored procedures is
hard work not related to any PL. Some can be partially solved by
functions executed in autonomous transactions. With background workers
we can implement asynchronous autonomous transactions - what can
coverage lot of use cases where transaction controlling should be used
in other databases.Well, those are all really hacks around a fundamental problem of allowing
user-defined, pre-stored code to execute outside of a transaction. I don't
think solving that is necessarily *that* hard (though certainly
non-trivial), but the fmgr interface is certainly NOT the right way to go
about that. In any case, that's a completely different discussion.
This points needs real use cases. Our current design is limited, but it
isn't bad - It is much better than Oracle or MSSQL design - we can talk
what use cases we can solve. I am sure, so we can find good mapping for
PLpgSQL for any design.
Regards
Pavel
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On 12/28/16 12:51 PM, Pavel Stehule wrote:
Now, the incompatibility can be hard issue - it is big question if we
lock some users on old versions because some users can save to lines of
code. Introduction of ROW_COUNT is lowly incompatibility - it can be
simply detected - but for example change of behave of FOUND variable is
terrible, because the code will be quietly calculate differently.
sometimes we can break code - probably people will not be happy, but
sometimes we can change the results - it can be big fail. So on one side
is big costs. On second side is few lines less code.
That's my whole point of why this needs to be settable at a global
level: so that people with a lot of legacy code can set the OLD behavior
at a global level, and deal with the old code over time.
If there's no global setting then there are only two choices: we default
to new behavior and force everyone to add a bunch of stuff to *every*
function they have (loads of complaints), or we default to old behavior
and no one bothers to even adopt the new usage because they have to add
extra stuff to every function. Either way is a failure. This is why I
think there MUST be some way to control this at a higher level than per
function.
Certainly GUCs aren't the only option, we could invent something else.
One feature I could see being useful is being able to set a default on a
schema level, which isn't currently possible with a GUC. But I can
certainly see database and global settings being useful, and perhaps
per-user as well. GUCs already have those.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2016-12-28 20:25 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 12/28/16 12:51 PM, Pavel Stehule wrote:
Now, the incompatibility can be hard issue - it is big question if we
lock some users on old versions because some users can save to lines of
code. Introduction of ROW_COUNT is lowly incompatibility - it can be
simply detected - but for example change of behave of FOUND variable is
terrible, because the code will be quietly calculate differently.
sometimes we can break code - probably people will not be happy, but
sometimes we can change the results - it can be big fail. So on one side
is big costs. On second side is few lines less code.That's my whole point of why this needs to be settable at a global level:
so that people with a lot of legacy code can set the OLD behavior at a
global level, and deal with the old code over time.If there's no global setting then there are only two choices: we default
to new behavior and force everyone to add a bunch of stuff to *every*
function they have (loads of complaints), or we default to old behavior and
no one bothers to even adopt the new usage because they have to add extra
stuff to every function. Either way is a failure. This is why I think there
MUST be some way to control this at a higher level than per function.
we can have both - plpgsql.variable_conflict can be precedent.
Certainly GUCs aren't the only option, we could invent something else. One
feature I could see being useful is being able to set a default on a schema
level, which isn't currently possible with a GUC. But I can certainly see
database and global settings being useful, and perhaps per-user as well.
GUCs already have those.
yes, without GUC you cannot set the behave of plpgsql globally.
Regards
Pavel
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
Hi
I wrote some initial patch
Do you think so has sense to continue in this topic?
Regards
Pavel
Attachments:
plpgsql_extra_runtime_checks.patchtext/x-patch; charset=US-ASCII; name=plpgsql_extra_runtime_checks.patchDownload
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 77e7440..15b867fa 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -3616,6 +3616,24 @@ exec_stmt_execsql(PLpgSQL_execstate *estate,
long tcount;
int rc;
PLpgSQL_expr *expr = stmt->sqlstmt;
+ bool too_many_rows_check;
+ int too_many_rows_level;
+
+ if (plpgsql_extra_errors & PLPGSQL_XCHECK_TOOMANYROWS)
+ {
+ too_many_rows_check = true;
+ too_many_rows_level = ERROR;
+ }
+ else if (plpgsql_extra_warnings & PLPGSQL_XCHECK_TOOMANYROWS)
+ {
+ too_many_rows_check = true;
+ too_many_rows_level = WARNING;
+ }
+ else
+ {
+ too_many_rows_check = false;
+ too_many_rows_level = NOTICE;
+ }
/*
* On the first call for this statement generate the plan, and detect
@@ -3666,7 +3684,7 @@ exec_stmt_execsql(PLpgSQL_execstate *estate,
*/
if (stmt->into)
{
- if (stmt->strict || stmt->mod_stmt)
+ if (stmt->strict || stmt->mod_stmt || too_many_rows_check)
tcount = 2;
else
tcount = 1;
@@ -3786,7 +3804,7 @@ exec_stmt_execsql(PLpgSQL_execstate *estate,
}
else
{
- if (n > 1 && (stmt->strict || stmt->mod_stmt))
+ if (n > 1 && (stmt->strict || stmt->mod_stmt || too_many_rows_check))
{
char *errdetail;
@@ -3795,7 +3813,7 @@ exec_stmt_execsql(PLpgSQL_execstate *estate,
else
errdetail = NULL;
- ereport(ERROR,
+ ereport(too_many_rows_level == WARNING && !stmt->strict ? WARNING : ERROR,
(errcode(ERRCODE_TOO_MANY_ROWS),
errmsg("query returned more than one row"),
errdetail ? errdetail_internal("parameters: %s", errdetail) : 0));
@@ -6009,12 +6027,48 @@ exec_move_row(PLpgSQL_execstate *estate,
int t_natts;
int fnum;
int anum;
+ bool strict_multiassignment_check;
+ int strict_multiassignment_level;
+
+ if (plpgsql_extra_errors & PLPGSQL_XCHECK_STRICTMULTIASSIGNMENT)
+ {
+ strict_multiassignment_check = true;
+ strict_multiassignment_level = ERROR;
+ }
+ else if (plpgsql_extra_warnings & PLPGSQL_XCHECK_STRICTMULTIASSIGNMENT)
+ {
+ strict_multiassignment_check = true;
+ strict_multiassignment_level = WARNING;
+ }
+ else
+ {
+ strict_multiassignment_check = false;
+ strict_multiassignment_level = NOTICE;
+ }
if (HeapTupleIsValid(tup))
t_natts = HeapTupleHeaderGetNatts(tup->t_data);
else
t_natts = 0;
+ if (strict_multiassignment_check)
+ {
+ int i;
+
+ anum = 0;
+ for (i = 0; i < td_natts; i++)
+ if (!tupdesc->attrs[i]->attisdropped)
+ anum++;
+
+ if (anum != row->nfields)
+ {
+ ereport(strict_multiassignment_level,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("Number of evaluated attributies (%d) does not match expected attributies (%d)",
+ anum, row->nfields)));
+ }
+ }
+
anum = 0;
for (fnum = 0; fnum < row->nfields; fnum++)
{
diff --git a/src/pl/plpgsql/src/pl_handler.c b/src/pl/plpgsql/src/pl_handler.c
index 36868fb..09bec86 100644
--- a/src/pl/plpgsql/src/pl_handler.c
+++ b/src/pl/plpgsql/src/pl_handler.c
@@ -89,6 +89,10 @@ plpgsql_extra_checks_check_hook(char **newvalue, void **extra, GucSource source)
if (pg_strcasecmp(tok, "shadowed_variables") == 0)
extrachecks |= PLPGSQL_XCHECK_SHADOWVAR;
+ else if (pg_strcasecmp(tok, "too_many_rows") == 0)
+ extrachecks |= PLPGSQL_XCHECK_TOOMANYROWS;
+ else if (pg_strcasecmp(tok, "strict_multi_assignment") == 0)
+ extrachecks |= PLPGSQL_XCHECK_STRICTMULTIASSIGNMENT;
else if (pg_strcasecmp(tok, "all") == 0 || pg_strcasecmp(tok, "none") == 0)
{
GUC_check_errdetail("Key word \"%s\" cannot be combined with other key words.", tok);
diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h
index c84a97b..820afe4 100644
--- a/src/pl/plpgsql/src/plpgsql.h
+++ b/src/pl/plpgsql/src/plpgsql.h
@@ -1025,7 +1025,9 @@ extern bool plpgsql_check_asserts;
/* extra compile-time checks */
#define PLPGSQL_XCHECK_NONE 0
-#define PLPGSQL_XCHECK_SHADOWVAR 1
+#define PLPGSQL_XCHECK_SHADOWVAR (1 << 1)
+#define PLPGSQL_XCHECK_TOOMANYROWS (1 << 2)
+#define PLPGSQL_XCHECK_STRICTMULTIASSIGNMENT (1 << 3)
#define PLPGSQL_XCHECK_ALL ((int) ~0)
extern int plpgsql_extra_warnings;
diff --git a/src/test/regress/expected/plpgsql.out b/src/test/regress/expected/plpgsql.out
index 79513e4..b09e83a 100644
--- a/src/test/regress/expected/plpgsql.out
+++ b/src/test/regress/expected/plpgsql.out
@@ -3422,6 +3422,54 @@ select shadowtest(1);
t
(1 row)
+-- runtime extra checks
+set plpgsql.extra_warnings to 'too_many_rows';
+do $$
+declare x int;
+begin
+ select v from generate_series(1,2) g(v) into x;
+end;
+$$;
+WARNING: query returned more than one row
+set plpgsql.extra_errors to 'too_many_rows';
+do $$
+declare x int;
+begin
+ select v from generate_series(1,2) g(v) into x;
+end;
+$$;
+ERROR: query returned more than one row
+CONTEXT: PL/pgSQL function inline_code_block line 4 at SQL statement
+reset plpgsql.extra_errors;
+reset plpgsql.extra_warnings;
+set plpgsql.extra_warnings to 'strict_multi_assignment';
+do $$
+declare
+ x int;
+ y int;
+begin
+ select 1 into x, y;
+ select 1,2 into x, y;
+ select 1,2,3 into x, y;
+end
+$$;
+WARNING: Number of evaluated attributies (1) does not match expected attributies (2)
+WARNING: Number of evaluated attributies (3) does not match expected attributies (2)
+set plpgsql.extra_errors to 'strict_multi_assignment';
+do $$
+declare
+ x int;
+ y int;
+begin
+ select 1 into x, y;
+ select 1,2 into x, y;
+ select 1,2,3 into x, y;
+end
+$$;
+ERROR: Number of evaluated attributies (1) does not match expected attributies (2)
+CONTEXT: PL/pgSQL function inline_code_block line 6 at SQL statement
+reset plpgsql.extra_errors;
+reset plpgsql.extra_warnings;
-- test scrollable cursor support
create function sc_test() returns setof integer as $$
declare
diff --git a/src/test/regress/sql/plpgsql.sql b/src/test/regress/sql/plpgsql.sql
index 877d3ad..aa47e93 100644
--- a/src/test/regress/sql/plpgsql.sql
+++ b/src/test/regress/sql/plpgsql.sql
@@ -2840,6 +2840,57 @@ declare f1 int; begin return 1; end $$ language plpgsql;
select shadowtest(1);
+-- runtime extra checks
+set plpgsql.extra_warnings to 'too_many_rows';
+
+do $$
+declare x int;
+begin
+ select v from generate_series(1,2) g(v) into x;
+end;
+$$;
+
+set plpgsql.extra_errors to 'too_many_rows';
+
+do $$
+declare x int;
+begin
+ select v from generate_series(1,2) g(v) into x;
+end;
+$$;
+
+reset plpgsql.extra_errors;
+reset plpgsql.extra_warnings;
+
+set plpgsql.extra_warnings to 'strict_multi_assignment';
+
+do $$
+declare
+ x int;
+ y int;
+begin
+ select 1 into x, y;
+ select 1,2 into x, y;
+ select 1,2,3 into x, y;
+end
+$$;
+
+set plpgsql.extra_errors to 'strict_multi_assignment';
+
+do $$
+declare
+ x int;
+ y int;
+begin
+ select 1 into x, y;
+ select 1,2 into x, y;
+ select 1,2,3 into x, y;
+end
+$$;
+
+reset plpgsql.extra_errors;
+reset plpgsql.extra_warnings;
+
-- test scrollable cursor support
create function sc_test() returns setof integer as $$
On 1/1/17 12:17 PM, Pavel Stehule wrote:
I wrote some initial patch
Do you think so has sense to continue in this topic?
Perhaps I'm not understanding what plpgsql_extra_errors does, but I
don't think either of these should depend on that being true. IMO these
two checks should be default to throwing an exception.
I think instead of tying these to extra_*, each GUC should accept a LOG
level.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-02 18:36 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/1/17 12:17 PM, Pavel Stehule wrote:
I wrote some initial patch
Do you think so has sense to continue in this topic?
Perhaps I'm not understanding what plpgsql_extra_errors does, but I don't
think either of these should depend on that being true. IMO these two
checks should be default to throwing an exception.
There are use cases where these patters should be used and has sense like
SELECT (polymorphiccomposite).* INTO c1, c2; -- take first two columns
SELECT xx FROM tab ORDER BY yy INTO target -- more rows not a issue
I understand plpgsql_extra_errors as feature that can be enabled on
developer, test, or preprod environments and can help to identify some
strange places.
I think instead of tying these to extra_*, each GUC should accept a LOG
level.
Why? Why the none, warning, error are not enough? Why are you think so
separate GUC can be better than plpgsql_extra_* ?
The fast setting plpgsql.extra_errors = 'all' can switch to some "safe"
configuration.
The fast setting plpgsql.extra_warnings = 'all' can helps with
identification, but doesn't break production (or doesn't breaks other tests)
Regards
Pavel
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On 1/2/17 12:06 PM, Pavel Stehule wrote:
SELECT (polymorphiccomposite).* INTO c1, c2; -- take first two columns
SELECT xx FROM tab ORDER BY yy INTO target -- more rows not a issue
I understand plpgsql_extra_errors as feature that can be enabled on
developer, test, or preprod environments and can help to identify some
strange places.
Yes, but the two cases you mentioned above are the "strange" cases, and
you should have to do something extra to allow those, not the other way
around.
I think instead of tying these to extra_*, each GUC should accept a
LOG level.Why? Why the none, warning, error are not enough? Why are you think so
separate GUC can be better than plpgsql_extra_* ?The fast setting plpgsql.extra_errors = 'all' can switch to some "safe"
configuration.
The fast setting plpgsql.extra_warnings = 'all' can helps with
identification, but doesn't break production (or doesn't breaks other tests)
I see two problems with those settings:
1) Neither is enabled by default, so 90% of users have no idea they
exist. Obviously that's an easy enough fix, but...
2) There's no way to incrementally change those values for a single
function. If you've set extra_errors = 'all' globally, a single function
can't say "turn off the too many rows setting for this function".
BTW, while I can see value in being able to change these settings for an
entire function, I think the recommended use should be to only change
them for a specific statement.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-02 20:16 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/2/17 12:06 PM, Pavel Stehule wrote:
SELECT (polymorphiccomposite).* INTO c1, c2; -- take first two columns
SELECT xx FROM tab ORDER BY yy INTO target -- more rows not a issue
I understand plpgsql_extra_errors as feature that can be enabled on
developer, test, or preprod environments and can help to identify some
strange places.Yes, but the two cases you mentioned above are the "strange" cases, and
you should have to do something extra to allow those, not the other way
around.
The second example is really strange. But the first example is used in
composite types conversion - when you convert from base to extend type.
This routine is used in plpgsql when you use a assignment statement
composite_var := another_composite_var
I think instead of tying these to extra_*, each GUC should accept a
LOG level.
Why? Why the none, warning, error are not enough? Why are you think so
separate GUC can be better than plpgsql_extra_* ?The fast setting plpgsql.extra_errors = 'all' can switch to some "safe"
configuration.
The fast setting plpgsql.extra_warnings = 'all' can helps with
identification, but doesn't break production (or doesn't breaks other
tests)I see two problems with those settings:
1) Neither is enabled by default, so 90% of users have no idea they exist.
Obviously that's an easy enough fix, but...
We can strongly talk about it - there can be a chapter in plpgsql doc. Now,
the patterns and antipatterns are not officially documented.
2) There's no way to incrementally change those values for a single
function. If you've set extra_errors = 'all' globally, a single function
can't say "turn off the too many rows setting for this function".
We can enhance the GUC syntax like "all -too_many_rows,-xxx"
BTW, while I can see value in being able to change these settings for an
entire function, I think the recommended use should be to only change them
for a specific statement.
What you can do in plain assign statement
target := expression ?
My border is any compatibility break - and I would not to across it. First
issue is probably harder
related to typo "select 1 x into c1,c2" and it can be detected by
plpgsql_check.
Second issue is not a performance issue today (we read only 2 rows
everytime) and it is hard how often it returns wrong result. This issue
cannot be detected by plpgsql_check now.
Regards
Pavel
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On Tue, Dec 27, 2016 at 10:09 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 12/27/16 4:56 PM, Merlin Moncure wrote:
On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:First I describe my initial position. I am strongly against introduction
"new" language - plpgsql2 or new plpgsql, or any else. The trust of
developers to us is important and introduction of any not compatible or
different feature has to have really big reason. PostgreSQL is
conservative
environment, and PLpgSQL should not be a exception. More - I have not anyWhich is why this is an external fork of plpgsql.
** The real problem is that we have no mechanism for allowing a PL's
language/syntax/API to move forward without massive backwards compatibility
problems. **
Just got back from break :-). Have some thoughts on this. Backwards
compatibility is really a fundamental problem. There's really no
solution to it other than to try and avoid using syntax to solve
problems. It should be obvious to everyone that plgsql cannot
withstand a compatibility break. Another language could be offered as
an alternative in core (say, pl/psm or pl/v8), but pl/pgsql has to
support old code. Some really out there features could maybe be
redacted (in particular, using = for assignment), but not not much.
But I guess we're stuck with the status quo.
I think we ought to avoid language features that influence the
behavior (performance is ok) of the code (and that includes throwing
errors). That's a freight train headed towards javscript's 'strict'
mode, which is thinly disguised language fork. #option and pragma
type syntaxes are trying to cheat the language -- hardly anyone uses
them and it's a tricky way to try and make the language into something
other than it is.
C does it right -- dubious code is raised as warnings and it's up to
the end user to determine which warnings are interesting and likely to
be an error. So, rather than hacking the language to control throwing
and errors and such there should be some ability validate the function
heavily and verify suspicious use of INTO or other dubious things
(unused variables, masked assignments, etc). The validation output
could even be a set returning function.
So -1 to strict mode, unless we can make a case why this can't be done
as part of checking/validation.
Other random points:
*) Another major pain point is swapping in the input variables for
debugging purposes. Something that emits a script based on a set of
arguments would be wonderful.
*) Would also like to have a FINALLY block
*) A mechanic to manually print out a stack trace for debugging
purposes would be helpful.
*) COPY not being able to accept arguments as variables (in particular
the filename) is a major headache
*) Upon error, we ought to print a few lines of context on either side
of the error. Not very fun to try and figure out where the errors is
when you are working off of 'syntax error near "FROM"' etc. This is a
major problem throughout the platform IMO.
*) Some user visible mechanic other than forcing SQL through EXECUTE
to be able to control plan caching would be useful.
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-03 16:23 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
On Tue, Dec 27, 2016 at 10:09 PM, Jim Nasby <Jim.Nasby@bluetreble.com>
wrote:On 12/27/16 4:56 PM, Merlin Moncure wrote:
On Tue, Dec 27, 2016 at 1:54 AM, Pavel Stehule <pavel.stehule@gmail.com
wrote:
First I describe my initial position. I am strongly against
introduction
"new" language - plpgsql2 or new plpgsql, or any else. The trust of
developers to us is important and introduction of any not compatible or
different feature has to have really big reason. PostgreSQL is
conservative
environment, and PLpgSQL should not be a exception. More - I have notany
Which is why this is an external fork of plpgsql.
** The real problem is that we have no mechanism for allowing a PL's
language/syntax/API to move forward without massive backwardscompatibility
problems. **
Just got back from break :-). Have some thoughts on this. Backwards
compatibility is really a fundamental problem. There's really no
solution to it other than to try and avoid using syntax to solve
problems. It should be obvious to everyone that plgsql cannot
withstand a compatibility break. Another language could be offered as
an alternative in core (say, pl/psm or pl/v8), but pl/pgsql has to
support old code. Some really out there features could maybe be
redacted (in particular, using = for assignment), but not not much.
But I guess we're stuck with the status quo.I think we ought to avoid language features that influence the
behavior (performance is ok) of the code (and that includes throwing
errors). That's a freight train headed towards javscript's 'strict'
mode, which is thinly disguised language fork. #option and pragma
type syntaxes are trying to cheat the language -- hardly anyone uses
them and it's a tricky way to try and make the language into something
other than it is.C does it right -- dubious code is raised as warnings and it's up to
the end user to determine which warnings are interesting and likely to
be an error. So, rather than hacking the language to control throwing
and errors and such there should be some ability validate the function
heavily and verify suspicious use of INTO or other dubious things
(unused variables, masked assignments, etc). The validation output
could even be a set returning function.
So -1 to strict mode, unless we can make a case why this can't be done
as part of checking/validation.
Can be plpgsq.extra_errors and plpgsql.extra_warnings solution?
I am thinking so there is a space for improvement (in extra_* usage)
Do you know plpgsql_check https://github.com/okbob/plpgsql_check ?
Other random points:
*) Another major pain point is swapping in the input variables for
debugging purposes. Something that emits a script based on a set of
arguments would be wonderful.
???
*) Would also like to have a FINALLY block
What you can do there?
*) A mechanic to manually print out a stack trace for debugging
purposes would be helpful.
I had plan to develop a extension for this purpose - easy printing stack,
function parameters, and local variables. But I had a motivation to start
it. It can be usable for any PL
*) COPY not being able to accept arguments as variables (in particular
the filename) is a major headache
There is a patch "COPY as function"
*) Upon error, we ought to print a few lines of context on either side
of the error. Not very fun to try and figure out where the errors is
when you are working off of 'syntax error near "FROM"' etc. This is a
major problem throughout the platform IMO.
Have not idea how to fix it
*) Some user visible mechanic other than forcing SQL through EXECUTE
to be able to control plan caching would be useful.
fully agree.
Have you some ideas?
What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE - any non
trivial plans will not be cached - and evaluated as parametrized query only.
Regards
Pavel
Show quoted text
merlin
On 1/2/17 1:51 PM, Pavel Stehule wrote:
1) Neither is enabled by default, so 90% of users have no idea they
exist. Obviously that's an easy enough fix, but...We can strongly talk about it - there can be a chapter in plpgsql doc.
Now, the patterns and antipatterns are not officially documented.
Or just fix the issue, provide the backwards compatability GUCs and move on.
2) There's no way to incrementally change those values for a single
function. If you've set extra_errors = 'all' globally, a single
function can't say "turn off the too many rows setting for this
function".We can enhance the GUC syntax like "all -too_many_rows,-xxx"
Why create all that framework when we could just have multiple
plpgsql.blah GUCs? plpgsql.multirow_assign_level=FATAL solves that
problem. We just need a plpgsql GUC for each backwards compatibility break.
BTW, while I can see value in being able to change these settings
for an entire function, I think the recommended use should be to
only change them for a specific statement.What you can do in plain assign statement
target := expression ?
The point I was trying to make there is if you do have some cases where
you need to silently ignore extra rows (for example) it's probably only
one statement and not an entire function. That said, if we just make
these options GUCs then you can just do SET and RESET.
My border is any compatibility break - and I would not to across it.
First issue is probably harder
If we never broke compatibility we'd still be allowing SELECT without
FROM, NULL = NULL being TRUE, and a whole bunch of other problems. We'd
also be stuck on protocol v1 (and of course not talking about what we
want in v4).
We've successfully made incompatible changes that were *far worse* than
this (ie: renaming pg_stat_activity.procpid). Obviously we shouldn't be
breaking things willy-nilly, but these are long-standing warts (dare I
say BUGS?) that should be fixed. They're ugly enough that someone took
the time to break plpgsql out of the core code and fork it.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-03 17:57 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/2/17 1:51 PM, Pavel Stehule wrote:
1) Neither is enabled by default, so 90% of users have no idea they
exist. Obviously that's an easy enough fix, but...We can strongly talk about it - there can be a chapter in plpgsql doc.
Now, the patterns and antipatterns are not officially documented.Or just fix the issue, provide the backwards compatability GUCs and move
on.
It is still compatibility break.
2) There's no way to incrementally change those values for a single
function. If you've set extra_errors = 'all' globally, a single
function can't say "turn off the too many rows setting for this
function".We can enhance the GUC syntax like "all -too_many_rows,-xxx"
Why create all that framework when we could just have multiple
plpgsql.blah GUCs? plpgsql.multirow_assign_level=FATAL solves that
problem. We just need a plpgsql GUC for each backwards compatibility break.
We have this framework already, so why don't use it.
BTW, while I can see value in being able to change these settings
for an entire function, I think the recommended use should be to
only change them for a specific statement.What you can do in plain assign statement
target := expression ?
The point I was trying to make there is if you do have some cases where
you need to silently ignore extra rows (for example) it's probably only one
statement and not an entire function. That said, if we just make these
options GUCs then you can just do SET and RESET.My border is any compatibility break - and I would not to across it.
First issue is probably harder
If we never broke compatibility we'd still be allowing SELECT without
FROM, NULL = NULL being TRUE, and a whole bunch of other problems. We'd
also be stuck on protocol v1 (and of course not talking about what we want
in v4).
This was in dark age - how much users of plpgsql was in 2000? Hard to speak
about Postgres as mature software in this era.
We've successfully made incompatible changes that were *far worse* than
this (ie: renaming pg_stat_activity.procpid). Obviously we shouldn't be
breaking things willy-nilly, but these are long-standing warts (dare I say
BUGS?) that should be fixed. They're ugly enough that someone took the time
to break plpgsql out of the core code and fork it.
We are not talk about features that can be simply marked as bugs, so there
is not too much what we should to fix it. We should to help to users to
identify some possible risk places.
Regards
Pavel
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On 1/3/17 9:58 AM, Pavel Stehule wrote:
** The real problem is that we have no mechanism for allowing a PL's
language/syntax/API to move forward without massive backwards compatibility
problems. **Just got back from break :-). Have some thoughts on this. Backwards
compatibility is really a fundamental problem. There's really no
solution to it other than to try and avoid using syntax to solve
problems. It should be obvious to everyone that plgsql cannot
withstand a compatibility break. Another language could be offered as
I don't think that's obvious at all. We've introduced incompatibility in
the main grammar without problem. You just need a way for people to get
the old behavior back if they need it. Eventually people will stop
relying on the old, broken behavior.
an alternative in core (say, pl/psm or pl/v8), but pl/pgsql has to
support old code. Some really out there features could maybe be
redacted (in particular, using = for assignment), but not not much.
But I guess we're stuck with the status quo.I think we ought to avoid language features that influence the
behavior (performance is ok) of the code (and that includes throwing
errors). That's a freight train headed towards javscript's 'strict'
mode, which is thinly disguised language fork. #option and pragma
type syntaxes are trying to cheat the language -- hardly anyone uses
them and it's a tricky way to try and make the language into something
other than it is.
Yeah, trying to bulk all these changes into one "magic setting" is not a
way to move forward. I think we're actually really well off in that
regard, because unlike most languages we have a very robust settings
system that allows controlling this behavior even at the statement level.
C does it right -- dubious code is raised as warnings and it's up to
the end user to determine which warnings are interesting and likely to
be an error. So, rather than hacking the language to control throwing
and errors and such there should be some ability validate the function
heavily and verify suspicious use of INTO or other dubious things
(unused variables, masked assignments, etc). The validation output
could even be a set returning function.
While static analysis can do some good (and I think we should actually
be enabling more of that by default), it won't realistically solve
everything. Multi-row assignment is a good example: NO ONE is going to
be OK with tons of warnings for every little := or SELECT INTO (without
strict), but the reality is that most code actually won't work correctly
if you have multiple rows coming back, so there's nothing technically
wrong with `var = field FROM table WHERE table_id = plpgsql_variable` if
table_id is the PK: you'll always get 0 or 1 rows back.
So -1 to strict mode, unless we can make a case why this can't be done
as part of checking/validation.Can be plpgsq.extra_errors and plpgsql.extra_warnings solution?
I am thinking so there is a space for improvement (in extra_* usage)
Do you know plpgsql_check https://github.com/okbob/plpgsql_check ?
I think we should look at what parts of that we should pull into core
(as well as enabling more by default). Stuff that can be done at
compile/load time is certainly better than runtime checks.
Other random points:
*) Another major pain point is swapping in the input variables for
debugging purposes. Something that emits a script based on a set of
arguments would be wonderful.???
Yeah, could you elaborate here?
*) Would also like to have a FINALLY block
What you can do there?
It's a block that ALWAYS executes, even if an exception occurs. Python
has this[1]. That (along with an ELSE clause for if there is no
exception) would mean you could catch an exception for a single command
instead of a bunch of commands.
Somewhat related to that, I wish you could make GUC changes that were
local only to a specific BEGIN block. AFAIK the GUC infrastructure fully
supports that, it would just need to be exposed in plpgsql.
*) A mechanic to manually print out a stack trace for debugging
purposes would be helpful.I had plan to develop a extension for this purpose - easy printing
stack, function parameters, and local variables. But I had a motivation
to start it. It can be usable for any PL
I assume you're thinking an SRF that spits out PG_CONTEXT? It'd be
really nice if you could also get things like function names and line
numbers broken out separately. I've thought of building this myself.
BTW, the biggest case I can think of using this for is a userspace
method of doing "private" functions, where the function throws an
exception unless it was called directly by a set of allowed functions
(or views).
*) COPY not being able to accept arguments as variables (in particular
the filename) is a major headacheThere is a patch "COPY as function"
Instead of just COPY, I'd like an easier way to pass identifiers into
SQL commands. format() certainly makes this easier than it was, but %3$s
gets old pretty quick. (Of course, if we had the concept of a dictionary
you could actually name the parameters and it wouldn't be quite so bad...)
*) Upon error, we ought to print a few lines of context on either side
of the error. Not very fun to try and figure out where the errors is
when you are working off of 'syntax error near "FROM"' etc. This is a
major problem throughout the platform IMO.Have not idea how to fix it
To do this I think you'd need to keep an array of preceding line
positions instead of just one, which I don't think would be that hard.
The bigger problem would be only spitting out the extra info on the
initial error, and not for subsequent context calls up the stack. I
don't think there's currently a way to tell if you're the level that the
ereport originated at (at least, not in an error).
Improving that would definitely be useful across the board, because
right now DEBUG becomes completely useless once you have more than 1 or
2 levels of nested calls.
*) Some user visible mechanic other than forcing SQL through EXECUTE
to be able to control plan caching would be useful.fully agree.
Have you some ideas?
What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE - any
non trivial plans will not be cached - and evaluated as parametrized
query only.
I'd also like the ability to do a "localized" PREPARE; similar to a SQL
level PREPARE statement, but ensuring that the statement got deallocated
when the block ended.
1: https://docs.python.org/3/tutorial/errors.html#defining-clean-up-actions
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 1/3/17 11:19 AM, Pavel Stehule wrote:
2) There's no way to incrementally change those values for a
single
function. If you've set extra_errors = 'all' globally, a single
function can't say "turn off the too many rows setting for this
function".We can enhance the GUC syntax like "all -too_many_rows,-xxx"
Why create all that framework when we could just have multiple
plpgsql.blah GUCs? plpgsql.multirow_assign_level=FATAL solves that
problem. We just need a plpgsql GUC for each backwards compatibility
break.We have this framework already, so why don't use it.
We *don't* have a framework that works for this, because you can't
incrementally modify extra_errors. Maybe extra_errors is an OK API for
static checking, but it's definitely a BAD API for something you'd need
to control at a function (or even statement) level.
If we never broke compatibility we'd still be allowing SELECT
without FROM, NULL = NULL being TRUE, and a whole bunch of other
problems. We'd also be stuck on protocol v1 (and of course not
talking about what we want in v4).This was in dark age - how much users of plpgsql was in 2000? Hard to
speak about Postgres as mature software in this era.
I don't know about you' but I've considered Postgres to be mature since
at least 8.0, if not earlier. Actually, in many ways it was far more
mature than other databases I was using in 2000 (let alone 2007).
We've successfully made incompatible changes that were *far worse*
than this (ie: renaming pg_stat_activity.procpid). Obviously we
shouldn't be breaking things willy-nilly, but these are
long-standing warts (dare I say BUGS?) that should be fixed. They're
ugly enough that someone took the time to break plpgsql out of the
core code and fork it.We are not talk about features that can be simply marked as bugs, so
there is not too much what we should to fix it. We should to help to
users to identify some possible risk places.
You keep claiming that these aren't serious bugs, yet someone felt so
strongly that they ARE serious bugs that they forked the entire PL.
If you're not willing to even consider a compatibility break (with a
means to get the old behavior back) then I don't think there's any point
in continuing this thread, because some of these issues can NOT be
reasonably solved by a checker.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-03 18:41 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/3/17 11:19 AM, Pavel Stehule wrote:
2) There's no way to incrementally change those values for a
single
function. If you've set extra_errors = 'all' globally, a
single
function can't say "turn off the too many rows setting for
this
function".We can enhance the GUC syntax like "all -too_many_rows,-xxx"
Why create all that framework when we could just have multiple
plpgsql.blah GUCs? plpgsql.multirow_assign_level=FATAL solves that
problem. We just need a plpgsql GUC for each backwards compatibility
break.We have this framework already, so why don't use it.
We *don't* have a framework that works for this, because you can't
incrementally modify extra_errors. Maybe extra_errors is an OK API for
static checking, but it's definitely a BAD API for something you'd need to
control at a function (or even statement) level.
I have different opinion then you - sure - it should not to change behave,
it should to help with identification. And it is enough for this purpose.
If we never broke compatibility we'd still be allowing SELECT
without FROM, NULL = NULL being TRUE, and a whole bunch of other
problems. We'd also be stuck on protocol v1 (and of course not
talking about what we want in v4).This was in dark age - how much users of plpgsql was in 2000? Hard to
speak about Postgres as mature software in this era.I don't know about you' but I've considered Postgres to be mature since at
least 8.0, if not earlier. Actually, in many ways it was far more mature
than other databases I was using in 2000 (let alone 2007).We've successfully made incompatible changes that were *far worse*
than this (ie: renaming pg_stat_activity.procpid). Obviously we
shouldn't be breaking things willy-nilly, but these are
long-standing warts (dare I say BUGS?) that should be fixed. They're
ugly enough that someone took the time to break plpgsql out of the
core code and fork it.We are not talk about features that can be simply marked as bugs, so
there is not too much what we should to fix it. We should to help to
users to identify some possible risk places.You keep claiming that these aren't serious bugs, yet someone felt so
strongly that they ARE serious bugs that they forked the entire PL.
Sorry, but it it is subjective - and there can be different opinions - some
body would to prefer more rigidity, some other less rigidity.
If you're not willing to even consider a compatibility break (with a means
to get the old behavior back) then I don't think there's any point in
continuing this thread, because some of these issues can NOT be reasonably
solved by a checker.
yes, I don't would to consider about a compatibility break. I accept so you
have different opinion.
I'll send this patch + doc to next commitfest - and depends on commiters if
the patch will be rejected or not. I know so it should not be fully fixed,
but it is step forward from my perspective.
Thank you for discussion
Regards
Pavel
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On Tue, Jan 3, 2017 at 9:58 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
2017-01-03 16:23 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
So -1 to strict mode, unless we can make a case why this can't be done
as part of checking/validation.Can be plpgsq.extra_errors and plpgsql.extra_warnings solution?
I am thinking so there is a space for improvement (in extra_* usage)
extra_warnings seems ok at the GUC level. However it's bad to have a
body of code fail to compile based on GUC. check_function_bodies for
example is a complete hack and should be avoided if at all possible
IMO. There is very good informal rule that GUC should not impact
behavior (minus some special cases like timeouts). Good examples of
failure to follow this rule are mysql and php.
Maybe settings at level of extension could be ok, but I'm skeptical.
Good languages are clear without needing extra context.
Do you know plpgsql_check https://github.com/okbob/plpgsql_check ?
Yes. This is good design and should be model for core-work (if any).
In my ideal world, this could would be part of pgxn and to have pgxn
client be installed in core. For plpgsql to enter modern era we need
standardized packaging and deployment like cran, npm, etc.
Other random points:
*) Another major pain point is swapping in the input variables for
debugging purposes. Something that emits a script based on a set of
arguments would be wonderful.???
Often for debugging of complicated cases I'm starting from errors in
database log with function name and argument values. Sometimes I find
myself pasting pl/pgsql function into text editor and replacing input
variables with known values.
*) Would also like to have a FINALLY block
What you can do there?
This is syntax sugar so you don't need second begin/end/exception
block or duplicated code. It separates error handling from cleanup.
BEGIN
PERFORM dblink_connect(...
<risky_stuff>
EXCEPTION WHEN OTHERS THEN
<log/handle error>
FINALLY
PERFORM dblink_disconnect(...
END;
*) Some user visible mechanic other than forcing SQL through EXECUTE
to be able to control plan caching would be useful.fully agree.
Have you some ideas?
What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE - any non
trivial plans will not be cached - and evaluated as parametrized query only.
I have slight preference for syntax marker for each query, similar to
INTO. Maybe 'UNCACHED'?
On Tue, Jan 3, 2017 at 10:57 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
Or just fix the issue, provide the backwards compatability GUCs and move on.
I really don't think this will fly. I'm not buying your argument (at
all) that compatibility breaks have have been cleanly done in the
past, at least not in the modern era. In any event, marginal language
improvements are not a good justification to do it. And yes, the
continual monkey around with column names in pg_stat_activity are a
major hassle. For heaven's sake, can we just add new columns and/or
create a new view?
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-03 20:54 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
On Tue, Jan 3, 2017 at 9:58 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:2017-01-03 16:23 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
So -1 to strict mode, unless we can make a case why this can't be done
as part of checking/validation.Can be plpgsq.extra_errors and plpgsql.extra_warnings solution?
I am thinking so there is a space for improvement (in extra_* usage)
extra_warnings seems ok at the GUC level. However it's bad to have a
body of code fail to compile based on GUC. check_function_bodies for
example is a complete hack and should be avoided if at all possible
IMO. There is very good informal rule that GUC should not impact
behavior (minus some special cases like timeouts). Good examples of
failure to follow this rule are mysql and php.Maybe settings at level of extension could be ok, but I'm skeptical.
Good languages are clear without needing extra context.Do you know plpgsql_check https://github.com/okbob/plpgsql_check ?
Yes. This is good design and should be model for core-work (if any).
In my ideal world, this could would be part of pgxn and to have pgxn
client be installed in core. For plpgsql to enter modern era we need
standardized packaging and deployment like cran, npm, etc.Other random points:
*) Another major pain point is swapping in the input variables for
debugging purposes. Something that emits a script based on a set of
arguments would be wonderful.???
Often for debugging of complicated cases I'm starting from errors in
database log with function name and argument values. Sometimes I find
myself pasting pl/pgsql function into text editor and replacing input
variables with known values.
is it related to plpgsql debugger? Have not idea how it can be better on
language level.
*) Would also like to have a FINALLY block
What you can do there?
This is syntax sugar so you don't need second begin/end/exception
block or duplicated code. It separates error handling from cleanup.BEGIN
PERFORM dblink_connect(...
<risky_stuff>
EXCEPTION WHEN OTHERS THEN
<log/handle error>
FINALLY
PERFORM dblink_disconnect(...
END;
Does know somebody this pattern from Ada or PL/SQL?
*) Some user visible mechanic other than forcing SQL through EXECUTE
to be able to control plan caching would be useful.fully agree.
Have you some ideas?
What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE - any
non
trivial plans will not be cached - and evaluated as parametrized query
only.
I have slight preference for syntax marker for each query, similar to
INTO. Maybe 'UNCACHED'?
I am not clean opinion - the statement level is nice, but what readability?
SELECT UNCACHED t.a, t.b FROM INTO a,b;
Regards
Pavel
Show quoted text
On Tue, Jan 3, 2017 at 10:57 AM, Jim Nasby <Jim.Nasby@bluetreble.com>
wrote:Or just fix the issue, provide the backwards compatability GUCs and move
on.
I really don't think this will fly. I'm not buying your argument (at
all) that compatibility breaks have have been cleanly done in the
past, at least not in the modern era. In any event, marginal language
improvements are not a good justification to do it. And yes, the
continual monkey around with column names in pg_stat_activity are a
major hassle. For heaven's sake, can we just add new columns and/or
create a new view?merlin
On Tue, Jan 3, 2017 at 2:15 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
2017-01-03 20:54 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
On Tue, Jan 3, 2017 at 9:58 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:2017-01-03 16:23 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
*) Would also like to have a FINALLY block
What you can do there?
This is syntax sugar so you don't need second begin/end/exception
block or duplicated code. It separates error handling from cleanup.BEGIN
PERFORM dblink_connect(...
<risky_stuff>
EXCEPTION WHEN OTHERS THEN
<log/handle error>
FINALLY
PERFORM dblink_disconnect(...
END;Does know somebody this pattern from Ada or PL/SQL?
I guess probably not. It's a standard pattern in modern EH languages
(for example, https://msdn.microsoft.com/en-us/library/dszsf989.aspx).
*) Some user visible mechanic other than forcing SQL through EXECUTE
to be able to control plan caching would be useful.fully agree.
Have you some ideas?
What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE - any
non
trivial plans will not be cached - and evaluated as parametrized query
only.I have slight preference for syntax marker for each query, similar to
INTO. Maybe 'UNCACHED'?I am not clean opinion - the statement level is nice, but what readability?
SELECT UNCACHED t.a, t.b FROM INTO a,b;
Yeah -- this is pretty ugly admittedly. Maybe control directive is
ok, as long as you can set it mid function?
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-04 16:49 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
On Tue, Jan 3, 2017 at 2:15 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:2017-01-03 20:54 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
On Tue, Jan 3, 2017 at 9:58 AM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:2017-01-03 16:23 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
*) Would also like to have a FINALLY block
What you can do there?
This is syntax sugar so you don't need second begin/end/exception
block or duplicated code. It separates error handling from cleanup.BEGIN
PERFORM dblink_connect(...
<risky_stuff>
EXCEPTION WHEN OTHERS THEN
<log/handle error>
FINALLY
PERFORM dblink_disconnect(...
END;Does know somebody this pattern from Ada or PL/SQL?
I guess probably not. It's a standard pattern in modern EH languages
(for example, https://msdn.microsoft.com/en-us/library/dszsf989.aspx).*) Some user visible mechanic other than forcing SQL through EXECUTE
to be able to control plan caching would be useful.fully agree.
Have you some ideas?
What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE - any
non
trivial plans will not be cached - and evaluated as parametrized query
only.I have slight preference for syntax marker for each query, similar to
INTO. Maybe 'UNCACHED'?I am not clean opinion - the statement level is nice, but what
readability?
SELECT UNCACHED t.a, t.b FROM INTO a,b;
Yeah -- this is pretty ugly admittedly. Maybe control directive is
ok, as long as you can set it mid function?
ADA uses for this purpose PRAGMA keyword - it is used for everything in ADA
- cycle iteration optimization, ...the scope can be statement, block,
procedure.
so something like
BEGIN
PRAGMA uncached_plans;
SELECT ...
..
END;
But it should be verified by some PL/SQL or Ada experts
Regards
Show quoted text
merlin
SELECT UNCACHED t.a, t.b FROM INTO a,b;
Yeah -- this is pretty ugly admittedly. Maybe control directive is
ok, as long as you can set it mid function?ADA uses for this purpose PRAGMA keyword - it is used for everything in
ADA - cycle iteration optimization, ...the scope can be statement, block,
procedure.
Pragma is used for changing (enforcing) behave. There are pragmas ada_05,
ada_2012, ..
Show quoted text
so something like
BEGIN
PRAGMA uncached_plans;
SELECT ...
..
END;But it should be verified by some PL/SQL or Ada experts
Regards
merlin
*) Some user visible mechanic other than forcing SQL through EXECUTE
to be able to control plan caching would be useful.fully agree.
Have you some ideas?
What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE -
any
non
trivial plans will not be cached - and evaluated as parametrizedquery
only.
I have slight preference for syntax marker for each query, similar to
INTO. Maybe 'UNCACHED'?I am not clean opinion - the statement level is nice, but what
readability?
SELECT UNCACHED t.a, t.b FROM INTO a,b;
Yeah -- this is pretty ugly admittedly. Maybe control directive is
ok, as long as you can set it mid function?ADA uses for this purpose PRAGMA keyword - it is used for everything in
ADA - cycle iteration optimization, ...the scope can be statement, block,
procedure.so something like
BEGIN
PRAGMA uncached_plans;
SELECT ...
..
END;But it should be verified by some PL/SQL or Ada experts
Little bit better - if PRAGMA is used in DECLARE part, then it has block
scope
so some possible design can be:
DECLARE
PRAGMA UNCACHED_PLANS;
BEGIN
SELECT ... INTO ;
SELECT ... INTO ;
END;
This respects Ada and PL/SQL style - probably easy implementation
Regards
Pavel
Show quoted text
Regards
merlin
so some possible design can be:
DECLARE
PRAGMA UNCACHED_PLANS;
BEGIN
SELECT ... INTO ;
SELECT ... INTO ;
END;This respects Ada and PL/SQL style - probably easy implementation
Regards
Pavel
some examples based on Ada doc
FUNCTION xxx RETURN int AS
PRAGMA yyy -- pragma has function scope
BEGIN
FUNCTION xxx RETURN int AS
BEGIN
DECLARE
PRAGMA yyy -- pragma has block scope
Regards
Pavel
Show quoted text
Regards
merlin
On Wed, Jan 4, 2017 at 1:49 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
so some possible design can be:
DECLARE
PRAGMA UNCACHED_PLANS;
BEGIN
SELECT ... INTO ;
SELECT ... INTO ;
END;This respects Ada and PL/SQL style - probably easy implementation
Regards
Pavel
some examples based on Ada doc
FUNCTION xxx RETURN int AS
PRAGMA yyy -- pragma has function scope
BEGINFUNCTION xxx RETURN int AS
BEGIN
DECLARE
PRAGMA yyy -- pragma has block scope
ok, sub-block makes sense over statement level IMO.
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Dec 28, 2016 at 2:25 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
That's my whole point of why this needs to be settable at a global level: so
that people with a lot of legacy code can set the OLD behavior at a global
level, and deal with the old code over time.
This has the same problem being discussed nearby on the case-folding
thread, though: any extension or third-party tool has to either work
with every possible value, or else it has to require one particular
value and therefore not be usable if you need another value for some
other reason.
Now, that's not to say we should never break backward compatibility.
Sometimes we should. I think the problem with PL/pgsql is that many
of the compatibility breaks that people want are likely to lead to
subtle misbehavior rather than outright failure, or are not easy to
spot via a cursory look ("hmm, could that SELECT query ever return
more than one row?"). Also, while everybody agrees that a bunch of
things should be changed and improved, not everybody agrees about
which ones, and sometimes person A desperately wants X changed while
person B desperately wants it changed in the other direction or left
alone. If there were a set of changes that we could make all at once,
call the result plpgsql2 or nplpgsql or whatever, and make everybody
happy, that'd be fabulous, but we don't. So we're left with doing
nothing, or having 2^n language variants controlled by GUCs or
pragmas, neither of which is appealing.
I think it would be a good idea to lock all the people who really care
about PL/pgsql in a room until they agree on what changes should be
made for the next version of the language. If they don't agree
quickly enough, we can resort to the techniques described in
https://en.wikipedia.org/wiki/Papal_election,_1268%E2%80%9371
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jan 5, 2017 at 11:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:
Now, that's not to say we should never break backward compatibility.
Sometimes we should. I think the problem with PL/pgsql is that many
of the compatibility breaks that people want are likely to lead to
subtle misbehavior rather than outright failure, or are not easy to
spot via a cursory look ("hmm, could that SELECT query ever return
more than one row?").
The core issue is that developers tend to be very poor at estimating
the impacts of changes; they look at things the the lens of the "new".
Professional software development is quite expensive and framework-
(I'll lump the database and it's various built-in features under that
term) level changes are essentially throwing out some portion of our
user's investments. Even fairly innocent compatibility breaks can
have major downstream impacts on our users and it's always much worse
than expected. For example, nobody thought that changing the bytea
text encoding format to hex would have corrupted our user's data, but
it did.
TBH, the discussion should shift away from specific issues on
compatibility and towards a specific set of standards and policies
around how to do it and what kinds of technical justifications need to
be made in advance. Security problems for example could be argued as
a valid reason to break user code, or poor adherence to the the SQL
standard which are in turn blocking other content. Minus those kinds
of considerations it's really just not worth doing, and there's no
tricky strategy like playing with version numbers that can game that
rule. A formal deprecation policy might be a good start.
The C language really should be considered the gold standard here.
Changes did have to be made, like getting rid of the notoriously
broken and insecure gets(), but they were made very, very slowly and
unobtrusively.
(I do think lpad should except "any" FWIW) :-D
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-05 18:36 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
On Thu, Jan 5, 2017 at 11:03 AM, Robert Haas <robertmhaas@gmail.com>
wrote:Now, that's not to say we should never break backward compatibility.
Sometimes we should. I think the problem with PL/pgsql is that many
of the compatibility breaks that people want are likely to lead to
subtle misbehavior rather than outright failure, or are not easy to
spot via a cursory look ("hmm, could that SELECT query ever return
more than one row?").The core issue is that developers tend to be very poor at estimating
the impacts of changes; they look at things the the lens of the "new".
Professional software development is quite expensive and framework-
(I'll lump the database and it's various built-in features under that
term) level changes are essentially throwing out some portion of our
user's investments. Even fairly innocent compatibility breaks can
have major downstream impacts on our users and it's always much worse
than expected. For example, nobody thought that changing the bytea
text encoding format to hex would have corrupted our user's data, but
it did.TBH, the discussion should shift away from specific issues on
compatibility and towards a specific set of standards and policies
around how to do it and what kinds of technical justifications need to
be made in advance. Security problems for example could be argued as
a valid reason to break user code, or poor adherence to the the SQL
standard which are in turn blocking other content. Minus those kinds
of considerations it's really just not worth doing, and there's no
tricky strategy like playing with version numbers that can game that
rule. A formal deprecation policy might be a good start.The C language really should be considered the gold standard here.
Changes did have to be made, like getting rid of the notoriously
broken and insecure gets(), but they were made very, very slowly and
unobtrusively.(I do think lpad should except "any" FWIW) :-D
I fully agree - sometimes there is fuzzy border in understanding what is
bug and what unhappy designed feature - probably lot of useful features can
be taken and used wrong.
Regards
Pavel
Show quoted text
merlin
Hi
some examples based on Ada doc
FUNCTION xxx RETURN int AS
PRAGMA yyy -- pragma has function scope
BEGINFUNCTION xxx RETURN int AS
BEGIN
DECLARE
PRAGMA yyy -- pragma has block scopeok, sub-block makes sense over statement level IMO.
I am sending proof concept (parser only implementation) - it allows to
control query plan usage on function and on block level
Examples
CREATE OR REPLACE FUNCTION fx()
RETURNS int AS $$
PRAGMA use_query_plan_cache(off); -- disable query plan cache on function
level
DECLARE r record;
BEGIN
FOR r IN SELECT ... -- some complex query, where we prefer on one shot
plan
LOOP
DECLARE
PRAGMA use_query_plan_cache(on); -- enable query plan cache for block
BEGIN
... statements inside cycle reuses query plan
END;
END LOOP;
END;
$$ LANGUAGE plpgsql;
or
BEGIN
...
DECLARE
PRAGMA use_query_plan_cache(off);
BEGIN
-- these queries has fresh plan only
SELECT ...
SELECT ...
END; -- end of PRAGMA scope
...
-- usual behave
END;
The behave is static - controlled on compile time only - the controlled
feature can be enabled/disabled. The impact on runtime is zero
* the syntax is verbose - readable - I prefer strong clean signal for
readers so something internals is different
* consistent with Ada, PL/SQL
* remove one reason for dynamic SQL
* allows to mix queries with without query plan cache - interesting for
patter FOR IN slow query LOOP fast query; END LOOP;
* there is small risk of compatibility break - if somebody use variables
named PRAGMA, because new reserved keyword is necessary - fails on syntax
error - so it is easy identified.
* this syntax can be reused - autonomous_transaction like PL/SQL. I read a
manual of Gnu Ada - and this is used often for implementation legacy
(obsolete) behave, functionality.
Notes, comments?
Regards
Pavel
Attachments:
pragma_query_plan_cache.patchtext/x-patch; charset=US-ASCII; name=pragma_query_plan_cache.patchDownload
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 3c52d71..a5fd040 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -288,6 +288,7 @@ do_compile(FunctionCallInfo fcinfo,
int *in_arg_varnos = NULL;
PLpgSQL_variable **out_arg_variables;
MemoryContext func_cxt;
+ PLpgSQL_settings settings;
/*
* Setup the scanner input and error info. We assume that this function
@@ -373,6 +374,11 @@ do_compile(FunctionCallInfo fcinfo,
plpgsql_DumpExecTree = false;
plpgsql_start_datums();
+ /* Prepare default for PRAGMA directives */
+ settings.prev = NULL;
+ settings.use_query_plan_cache = true;
+ plpgsql_settings_init(&settings);
+
switch (function->fn_is_trigger)
{
case PLPGSQL_NOT_TRIGGER:
@@ -796,6 +802,7 @@ plpgsql_compile_inline(char *proc_source)
PLpgSQL_variable *var;
int parse_rc;
MemoryContext func_cxt;
+ PLpgSQL_settings settings;
/*
* Setup the scanner input and error info. We assume that this function
@@ -851,6 +858,11 @@ plpgsql_compile_inline(char *proc_source)
plpgsql_DumpExecTree = false;
plpgsql_start_datums();
+ /* Prepare default for PRAGMA directives */
+ settings.prev = NULL;
+ settings.use_query_plan_cache = true;
+ plpgsql_settings_init(&settings);
+
/* Set up as though in a function returning VOID */
function->fn_rettype = VOIDOID;
function->fn_retset = false;
diff --git a/src/pl/plpgsql/src/pl_funcs.c b/src/pl/plpgsql/src/pl_funcs.c
index 906fe01..c7ee968 100644
--- a/src/pl/plpgsql/src/pl_funcs.c
+++ b/src/pl/plpgsql/src/pl_funcs.c
@@ -33,6 +33,7 @@
* ----------
*/
static PLpgSQL_nsitem *ns_top = NULL;
+static PLpgSQL_settings *settings_top = NULL;
/* ----------
@@ -226,6 +227,66 @@ plpgsql_ns_find_nearest_loop(PLpgSQL_nsitem *ns_cur)
/*
+ * Compilator settings routines
+ */
+
+void
+plpgsql_settings_init(PLpgSQL_settings *defval)
+{
+ settings_top = defval;
+}
+
+/*
+ * Creates new settings based on previous settings
+ */
+void
+plpgsql_settings_clone(void)
+{
+ PLpgSQL_settings *new = palloc(sizeof(PLpgSQL_settings));
+
+ Assert(settings_top != NULL);
+
+ memcpy(new, settings_top, sizeof(PLpgSQL_settings));
+ new->prev = settings_top;
+ settings_top = new;
+}
+
+/*
+ * apply a pragma to current settings
+ */
+void
+plpgsql_settings_pragma(PLpgSQL_pragma_type typ, bool value)
+{
+ Assert(settings_top != NULL);
+
+ switch (typ)
+ {
+ case PLPGSQL_PRAGMA_QUERY_PLAN_CACHE:
+ settings_top->use_query_plan_cache = value;
+ }
+}
+
+/*
+ * restore previous compiler settings
+ */
+void
+plpgsql_settings_pop(void)
+{
+ PLpgSQL_settings *prev;
+
+ Assert(settings_top != NULL);
+ prev = settings_top->prev;
+ pfree(settings_top);
+ settings_top = prev;
+}
+
+PLpgSQL_settings *
+plpgsql_settings_top(void)
+{
+ return settings_top;
+}
+
+/*
* Statement type as a string, for use in error messages etc.
*/
const char *
@@ -1534,7 +1595,7 @@ dump_getdiag(PLpgSQL_stmt_getdiag *stmt)
static void
dump_expr(PLpgSQL_expr *expr)
{
- printf("'%s'", expr->query);
+ printf("%s'%s'", expr->use_query_plan_cache ? "*" : "", expr->query);
}
void
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 4a4cd6a..2e8287b 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -216,6 +216,8 @@ static void check_raise_parameters(PLpgSQL_stmt_raise *stmt);
%type <ival> opt_scrollable
%type <fetch> opt_fetch_direction
+%type <boolean> onoff
+
%type <keyword> unreserved_keyword
@@ -306,6 +308,8 @@ static void check_raise_parameters(PLpgSQL_stmt_raise *stmt);
%token <keyword> K_NOT
%token <keyword> K_NOTICE
%token <keyword> K_NULL
+%token <keyword> K_OFF
+%token <keyword> K_ON
%token <keyword> K_OPEN
%token <keyword> K_OPTION
%token <keyword> K_OR
@@ -315,9 +319,11 @@ static void check_raise_parameters(PLpgSQL_stmt_raise *stmt);
%token <keyword> K_PG_EXCEPTION_CONTEXT
%token <keyword> K_PG_EXCEPTION_DETAIL
%token <keyword> K_PG_EXCEPTION_HINT
+%token <keyword> K_PRAGMA
%token <keyword> K_PRINT_STRICT_PARAMS
%token <keyword> K_PRIOR
%token <keyword> K_QUERY
+%token <keyword> K_QUERY_PLAN_CACHE
%token <keyword> K_RAISE
%token <keyword> K_RELATIVE
%token <keyword> K_RESULT_OID
@@ -348,9 +354,9 @@ static void check_raise_parameters(PLpgSQL_stmt_raise *stmt);
%%
-pl_function : comp_options pl_block opt_semi
+pl_function : comp_options pragmas_opt pl_block opt_semi
{
- plpgsql_parse_result = (PLpgSQL_stmt_block *) $2;
+ plpgsql_parse_result = (PLpgSQL_stmt_block *) $3;
}
;
@@ -398,6 +404,30 @@ opt_semi :
| ';'
;
+onoff : K_ON
+ {
+ $$ = true;
+ }
+ | K_OFF
+ {
+ $$ = false;
+ }
+ ;
+
+pragma : K_PRAGMA K_QUERY_PLAN_CACHE '(' onoff ')' ';'
+ {
+ plpgsql_settings_pragma(PLPGSQL_PRAGMA_QUERY_PLAN_CACHE, $4);
+ }
+ ;
+
+pragmas : pragmas pragma
+ | pragma
+ ;
+
+pragmas_opt :
+ | pragmas
+ ;
+
pl_block : decl_sect K_BEGIN proc_sect exception_sect K_END opt_label
{
PLpgSQL_stmt_block *new;
@@ -414,6 +444,7 @@ pl_block : decl_sect K_BEGIN proc_sect exception_sect K_END opt_label
check_labels($1.label, $6, @6);
plpgsql_ns_pop();
+ plpgsql_settings_pop();
$$ = (PLpgSQL_stmt *)new;
}
@@ -448,6 +479,10 @@ decl_start : K_DECLARE
{
/* Forget any variables created before block */
plpgsql_add_initdatums(NULL);
+
+ /* clone compiler settings */
+ plpgsql_settings_clone();
+
/*
* Disable scanner lookup of identifiers while
* we process the decl_stmts
@@ -476,6 +511,7 @@ decl_stmt : decl_statement
errmsg("block label must be placed before DECLARE, not after"),
parser_errposition(@1)));
}
+ | pragma
;
decl_statement : decl_varname decl_const decl_datatype decl_collate decl_notnull decl_defval
@@ -579,6 +615,7 @@ decl_statement : decl_varname decl_const decl_datatype decl_collate decl_notnull
}
strcpy(cp2, "'::pg_catalog.refcursor");
curname_def->query = pstrdup(buf);
+ curname_def->use_query_plan_cache = plpgsql_settings_top()->use_query_plan_cache;
new->default_val = curname_def;
new->cursor_explicit_expr = $7;
@@ -2442,6 +2479,8 @@ unreserved_keyword :
| K_NEXT
| K_NO
| K_NOTICE
+ | K_OFF
+ | K_ON
| K_OPEN
| K_OPTION
| K_PERFORM
@@ -2453,6 +2492,7 @@ unreserved_keyword :
| K_PRINT_STRICT_PARAMS
| K_PRIOR
| K_QUERY
+ | K_QUERY_PLAN_CACHE
| K_RAISE
| K_RELATIVE
| K_RESULT_OID
@@ -2689,6 +2729,7 @@ read_sql_construct(int until,
expr->paramnos = NULL;
expr->rwparam = -1;
expr->ns = plpgsql_ns_top();
+ expr->use_query_plan_cache = plpgsql_settings_top()->use_query_plan_cache;
pfree(ds.data);
if (valid_sql)
@@ -2937,6 +2978,7 @@ make_execsql_stmt(int firsttoken, int location)
expr->paramnos = NULL;
expr->rwparam = -1;
expr->ns = plpgsql_ns_top();
+ expr->use_query_plan_cache = plpgsql_settings_top()->use_query_plan_cache;
pfree(ds.data);
check_sql_expr(expr->query, location, 0);
@@ -3821,6 +3863,7 @@ read_cursor_args(PLpgSQL_var *cursor, int until, const char *expected)
expr->paramnos = NULL;
expr->rwparam = -1;
expr->ns = plpgsql_ns_top();
+ expr->use_query_plan_cache = plpgsql_settings_top()->use_query_plan_cache;
pfree(ds.data);
/* Next we'd better find the until token */
diff --git a/src/pl/plpgsql/src/pl_scanner.c b/src/pl/plpgsql/src/pl_scanner.c
index c401213..cf5bd39 100644
--- a/src/pl/plpgsql/src/pl_scanner.c
+++ b/src/pl/plpgsql/src/pl_scanner.c
@@ -84,6 +84,7 @@ static const ScanKeyword reserved_keywords[] = {
PG_KEYWORD("not", K_NOT, RESERVED_KEYWORD)
PG_KEYWORD("null", K_NULL, RESERVED_KEYWORD)
PG_KEYWORD("or", K_OR, RESERVED_KEYWORD)
+ PG_KEYWORD("pragma", K_PRAGMA, RESERVED_KEYWORD)
PG_KEYWORD("strict", K_STRICT, RESERVED_KEYWORD)
PG_KEYWORD("then", K_THEN, RESERVED_KEYWORD)
PG_KEYWORD("to", K_TO, RESERVED_KEYWORD)
@@ -139,6 +140,8 @@ static const ScanKeyword unreserved_keywords[] = {
PG_KEYWORD("next", K_NEXT, UNRESERVED_KEYWORD)
PG_KEYWORD("no", K_NO, UNRESERVED_KEYWORD)
PG_KEYWORD("notice", K_NOTICE, UNRESERVED_KEYWORD)
+ PG_KEYWORD("off", K_OFF, UNRESERVED_KEYWORD)
+ PG_KEYWORD("on", K_ON, UNRESERVED_KEYWORD)
PG_KEYWORD("open", K_OPEN, UNRESERVED_KEYWORD)
PG_KEYWORD("option", K_OPTION, UNRESERVED_KEYWORD)
PG_KEYWORD("perform", K_PERFORM, UNRESERVED_KEYWORD)
@@ -150,6 +153,7 @@ static const ScanKeyword unreserved_keywords[] = {
PG_KEYWORD("print_strict_params", K_PRINT_STRICT_PARAMS, UNRESERVED_KEYWORD)
PG_KEYWORD("prior", K_PRIOR, UNRESERVED_KEYWORD)
PG_KEYWORD("query", K_QUERY, UNRESERVED_KEYWORD)
+ PG_KEYWORD("query_plan_cache", K_QUERY_PLAN_CACHE, UNRESERVED_KEYWORD)
PG_KEYWORD("raise", K_RAISE, UNRESERVED_KEYWORD)
PG_KEYWORD("relative", K_RELATIVE, UNRESERVED_KEYWORD)
PG_KEYWORD("result_oid", K_RESULT_OID, UNRESERVED_KEYWORD)
diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h
index 3421eed..e3c4435 100644
--- a/src/pl/plpgsql/src/plpgsql.h
+++ b/src/pl/plpgsql/src/plpgsql.h
@@ -35,6 +35,14 @@
#define _(x) dgettext(TEXTDOMAIN, x)
/*
+ * Compiler directives
+ */
+typedef enum PLpgSQL_pragma_type
+{
+ PLPGSQL_PRAGMA_QUERY_PLAN_CACHE
+} PLpgSQL_pragma_type;
+
+/*
* Compiler's namespace item types
*/
typedef enum PLpgSQL_nsitem_type
@@ -224,6 +232,7 @@ typedef struct PLpgSQL_expr
int dno;
char *query;
SPIPlanPtr plan;
+ bool use_query_plan_cache;
Bitmapset *paramnos; /* all dnos referenced by this query */
int rwparam; /* dno of read/write param, or -1 if none */
@@ -1004,6 +1013,15 @@ typedef struct PLwdatum
List *idents; /* valid if composite name */
} PLwdatum;
+/*
+ * Compiler directives
+ */
+typedef struct PLpgSQL_settings
+{
+ struct PLpgSQL_settings *prev;
+ bool use_query_plan_cache;
+} PLpgSQL_settings;
+
/**********************************************************************
* Global variable declarations
**********************************************************************/
@@ -1123,6 +1141,15 @@ extern PLpgSQL_nsitem *plpgsql_ns_lookup_label(PLpgSQL_nsitem *ns_cur,
extern PLpgSQL_nsitem *plpgsql_ns_find_nearest_loop(PLpgSQL_nsitem *ns_cur);
/*
+ * Function for compiler directives processing in pl_func.c
+ */
+extern void plpgsql_settings_init(PLpgSQL_settings *defval);
+extern void plpgsql_settings_clone(void);
+extern void plpgsql_settings_pragma(PLpgSQL_pragma_type typ, bool value);
+extern PLpgSQL_settings *plpgsql_settings_top(void);
+extern void plpgsql_settings_pop(void);
+
+/*
* Other functions in pl_funcs.c
*/
extern const char *plpgsql_stmt_typename(PLpgSQL_stmt *stmt);
On Thu, Jan 5, 2017 at 7:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:
I think it would be a good idea to lock all the people who really care
about PL/pgsql in a room until they agree on what changes should be
made for the next version of the language. If they don't agree
quickly enough, we can resort to the techniques described in
https://en.wikipedia.org/wiki/Papal_election,_1268%E2%80%9371
I think that's a very good idea, and I'm happy to be locked into such a room.
I think such a discussion will be very fruitful,
given the others in the room have also
already decided they want a new language
and are there to discuss "the next version of the language",
instead of debating why they don't think we need a new language.
It would also be good if those people could bring laptops
with all their plpgsql code bases, to check if any of
the proposed possibly non-backwards compatible
syntax proposals would break nothing, just a few functions,
or a lot of functions in their code bases.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-03 17:57 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/2/17 1:51 PM, Pavel Stehule wrote:
1) Neither is enabled by default, so 90% of users have no idea they
exist. Obviously that's an easy enough fix, but...We can strongly talk about it - there can be a chapter in plpgsql doc.
Now, the patterns and antipatterns are not officially documented.Or just fix the issue, provide the backwards compatability GUCs and move
on.2) There's no way to incrementally change those values for a single
function. If you've set extra_errors = 'all' globally, a single
function can't say "turn off the too many rows setting for this
function".We can enhance the GUC syntax like "all -too_many_rows,-xxx"
Why create all that framework when we could just have multiple
plpgsql.blah GUCs? plpgsql.multirow_assign_level=FATAL solves that
problem. We just need a plpgsql GUC for each backwards compatibility break.BTW, while I can see value in being able to change these settings
for an entire function, I think the recommended use should be to
only change them for a specific statement.What you can do in plain assign statement
target := expression ?
The point I was trying to make there is if you do have some cases where
you need to silently ignore extra rows (for example) it's probably only one
statement and not an entire function. That said, if we just make these
options GUCs then you can just do SET and RESET.My border is any compatibility break - and I would not to across it.
First issue is probably harder
If we never broke compatibility we'd still be allowing SELECT without
FROM, NULL = NULL being TRUE, and a whole bunch of other problems. We'd
also be stuck on protocol v1 (and of course not talking about what we want
in v4).We've successfully made incompatible changes that were *far worse* than
this (ie: renaming pg_stat_activity.procpid). Obviously we shouldn't be
breaking things willy-nilly, but these are long-standing warts (dare I say
BUGS?) that should be fixed. They're ugly enough that someone took the time
to break plpgsql out of the core code and fork it.
The discussion about changing behave of current features has not a
solution. I don't believe so there is possible to find a win/win solution -
it is not possible. I respect the opinion all people here, but somebody
more afraid of language fragmentation, somebody else more from some
possible production issues. All arguments are valid, all arguments has same
value, all arguments are sent by people with lot of experience (but with
different experience - anybody works with different domains, uses different
patters, hits different kind of errors).
This discussion was +/- about behave of INTO clause - and if STRICT clause
should be default and if STRICT clause should be more strict.
if we introduce new pattern (new syntax), that can be strict enough then we
can go forward. New syntax will not have a impact on current customers code
base. If somebody prefer very strict behave, then he can use new syntax
quickly. Issues in old code can be detected by other tools - plpgsql_check,
and extra_warnings, extra_errors.
Current state
==========
INTO
-------
* is not strict in any directions - columns or rows
* not equal target and source column number cannot be detected in plpgsql
(plpgsql_check does it)
* missing rows - uses FOUND (PL/SQL), too much rows - uses GET DIAGNOSTICS
x = ROW_COUNT (ANSI/SQL pattern)
* doesn't need outer subtransaction (handled exception) for handling
specific situations
select a into t from test where a = i;
get diagnostics r = row_count;
if r > 1 then raise exception 'too much rows'; end if;
INTO STRICT
------------------
* is strict in rows - require exactly one row result
* not equal target and source column number cannot be detected in plpgsql
(plpgsql_check does it)
* raises a exceptions no_data_found, too_much_rows
* require outer subtransaction (handled exception) for handling
no_data_found (10% slowdown in worst case)
begin
select a into t from test where a = i;
exception
when no_data_found then
t = null;
end;
There is safe pattern (workaround) for missing columns in source - using
RECORD type. Access to missing field raises runtime error. Another solution
- plpgsql_check.
subselect assignment
-----------------------------
* is column strict - doesn't allow more columns now
* don't allow too_much_rows
* the FOUND is always true - has not change to check it
* although it is ANSI/SQL pattern it is undocumented in PostgreSQL (only
INTO clause is documented)
x := (select a from test where a = i)
officially unsupported implementation side effect assignment FROM
------------------------------------------------------------------------------------------
* is undocumented
* allows only one column
* behave is similar like INTO
x := a from test where a = i
Possible risk
============
* typo SELECT a b INTO a,b or SELECT a,b INTO b,a - currently I use a
RECORD type to be safe
* undetermined result in more rows
* unknown result in no_data_found
The questions - what we expect from new behave?
=======================================
* safety against typo - 100% yes
* safety against undetermined result - 100% yes
* what behave should be default in no_data_found case ? Exception or NULL?
- both has advantages for who require designed behave and disadvantage for
require second behave (a) handling the exception, b) IF NOT FOUND test)
Possible solution (without compatibility break)
===================================
introduction ANSI/SQL assingnment, multiassignment SET
------------------------------------------------------------------------------
SET varname = (SELECT ...)
SET (t1, t2, t3, ...) = (SELECT c1, c2, c3 ..)
- possible identifier collision with GUC (not in multiassignment variant)
- is not too robust against typo SET (t1, t2,..) = (SELECT t2,t1, ...)
- it is ANSI/SQL design/behave
introduction new syntax proposed by plpgsql2 project
-----------------------------------------------------------------------
SELECT t1 := c1, t2 := c2, ...
- it can be PostgreSQL specific syntax - full control over design
- maximally robust against typo
- long syntax, but for short syntax can be used SELECT c1,c2,c3, .. INTO
STRICT recvar
- what should be no_data_found behave?
I have nothing about a cost of "new syntax" implementation - but for me -
it looks like good solution for us - it can be win/win solution. It breaks
nothing - it introduce nice to have typo robust syntax.
Comments, notes?
Regards
Pavel
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
* EXECUTE and FOUND - this is incompatible change, extra check can be used
(test on unset variable). I see solution in leaving FOUND variable and
introduction of some new without this issue - ROW_COUNT maybe (this is
another possible incompatible change, but with higher benefit - maybe we
can introduce some aliasing, PRAGMA clause, default PRAGMAs, ..).
I checked current implementation of FOUND variable. If we introduce new
auto variable ROW_COUNT - exactly like FOUND, then it doesn't introduce any
compatibility break.
ROW_COUNT .. shortcut for GET DIAGNOSTICS row_count = ROW_COUNT.
Comments, notes?
Regards
Pavel
On 1/5/17 11:36 AM, Merlin Moncure wrote:
The C language really should be considered the gold standard here.
Changes did have to be made, like getting rid of the notoriously
broken and insecure gets(), but they were made very, very slowly and
unobtrusively.
For those not familiar... how did they accomplish that?
I'm certainly fine with changes being made very slowly. We carried the
missing_From GUC around for like a decade before ditching it. We have
other GUCs that have defaulted to not allowing silly behavior for a long
time as well. We might well need to leave the default for compatibility
GUCs set to current behavior for several more releases, to allow people
to start specifying which behavior they want.
I agree with Robert that there needs to be consensus that a change needs
to be made, but frankly I think 50% of this thread was people
disagreeing with *ANY* change that would be incompatible. IMHO that's a
ridiculous position that does not match expectations outside of plpgsql.
That kind of expectation means we have absolutely no way of fixing past
mistakes.
Certainly, there also needs to be agreement on what the new behavior
should be, but again, what I observed was an adamant insistence that
absolutely no break would be permitted.
As for using GUCs for these changes and that impact on extensions, I
don't see why that won't work for what we're discussing here. In a
worst-case scenario, extension authors would need to specify what
behavior they wanted in their extensions instead of blindly accepting
the default, by making sure those options were set for each function
they defined. While it would certainly be nice to avoid that extra work,
all the necessary infrastructure to handle that is already in place. And
if we wanted to avoid that hassle, we could allow custom GUC settings on
extensions, like we currently do for roles and databases.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 1/7/17 5:39 AM, Pavel Stehule wrote:
I checked current implementation of FOUND variable. If we introduce new
auto variable ROW_COUNT - exactly like FOUND, then it doesn't introduce
any compatibility break.
Except it would break every piece of code that had a row_count variable,
though I guess you could see which scoping level the variable had been
defined in.
I think the right solution in this case is to replace GET DIAGNOSTICs
with something easier to use, but I'm not sure what that would be.
I think this is another example of where not using some kind of
character to distinguish variables screws us. :/
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 1/7/17 2:06 AM, Pavel Stehule wrote:
SELECT t1 := c1, t2 := c2, ...
- it can be PostgreSQL specific syntax - full control over design
- maximally robust against typo
- long syntax, but for short syntax can be used SELECT c1,c2,c3, .. INTO
STRICT recvar
I don't think overloading a SQL command is a good idea. We'd be in
trouble if ANSI ever introduced :=. I think that could also conflict
with existing operators.
- what should be no_data_found behave?
Considering where we're at today, I don't think there should be a
default behavior; make the user specify somehow whether missing data is
allowed or not.
I have nothing about a cost of "new syntax" implementation - but for me
- it looks like good solution for us - it can be win/win solution. It
breaks nothing - it introduce nice to have typo robust syntax.
Related to that, I suspect we could add better support to existing
commands for at least some of these things. For example, SELECT ... INTO
NOMULTI (instead of STRICT) to indicate that multiple rows are an error
but missing data is OK.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
On 1/7/17 5:39 AM, Pavel Stehule wrote:
I checked current implementation of FOUND variable. If we introduce new
auto variable ROW_COUNT - exactly like FOUND, then it doesn't introduce
any compatibility break.
Except it would break every piece of code that had a row_count variable,
though I guess you could see which scoping level the variable had been
defined in.
If FOUND were declared at an outer scoping level such that any
user-created declaration overrode the name, then we could do likewise
for other auto variables and not fear compatibility breaks.
Currently, though, we don't seem to be quite there: it looks like
FOUND is an outer variable with respect to DECLARE blocks, but it's
more closely nested than parameter names. Compare:
regression=# create function foo1(bool) returns bool as
'declare found bool := $1; begin return found; end' language plpgsql;
CREATE FUNCTION
regression=# select foo1(true);
foo1
------
t
(1 row)
regression=# create function foo2(found bool) returns bool as
regression-# 'begin return found; end' language plpgsql;
CREATE FUNCTION
regression=# select foo2(true);
foo2
------
f
(1 row)
Not sure if changing this would be a good thing or not --- was
there reasoning behind this behavior, or was it just accidental?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 1/7/17 8:53 PM, Tom Lane wrote:
If FOUND were declared at an outer scoping level such that any
user-created declaration overrode the name, then we could do likewise
for other auto variables and not fear compatibility breaks.Currently, though, we don't seem to be quite there: it looks like
FOUND is an outer variable with respect to DECLARE blocks, but it's
more closely nested than parameter names.
Sorry, I'm not following... you can override a parameter name the same
way and get the same behavior, no?
BTW, I do wish you could change the label of the scope that arguments
went into, so that you could use that label to refer to function
parameters. If we allowed that it'd perhaps be the best of both worlds:
you'd be guaranteed access to all auto variables and parameters, and
that access wouldn't need to be tied to the function name (which can be
both painful and error prone).
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-08 3:53 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
On 1/7/17 5:39 AM, Pavel Stehule wrote:
I checked current implementation of FOUND variable. If we introduce new
auto variable ROW_COUNT - exactly like FOUND, then it doesn't introduce
any compatibility break.Except it would break every piece of code that had a row_count variable,
though I guess you could see which scoping level the variable had been
defined in.If FOUND were declared at an outer scoping level such that any
user-created declaration overrode the name, then we could do likewise
for other auto variables and not fear compatibility breaks.Currently, though, we don't seem to be quite there: it looks like
FOUND is an outer variable with respect to DECLARE blocks, but it's
more closely nested than parameter names. Compare:regression=# create function foo1(bool) returns bool as
'declare found bool := $1; begin return found; end' language plpgsql;
CREATE FUNCTION
regression=# select foo1(true);
foo1
------
t
(1 row)regression=# create function foo2(found bool) returns bool as
regression-# 'begin return found; end' language plpgsql;
CREATE FUNCTION
regression=# select foo2(true);
foo2
------
f
(1 row)Not sure if changing this would be a good thing or not --- was
there reasoning behind this behavior, or was it just accidental?
There are two related features in plpgsql2 project:
1. dynamic SQL sets FOUND variable
2. direct access to processed rows info via variable ROW_COUNT
@1 is incompatible change, @2 is good enough - so we should not to change
FOUND, but we can propagate ROW_COUNT instead.
Regards
Pavel
Show quoted text
regards, tom lane
2017-01-08 4:11 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/7/17 8:53 PM, Tom Lane wrote:
If FOUND were declared at an outer scoping level such that any
user-created declaration overrode the name, then we could do likewise
for other auto variables and not fear compatibility breaks.Currently, though, we don't seem to be quite there: it looks like
FOUND is an outer variable with respect to DECLARE blocks, but it's
more closely nested than parameter names.Sorry, I'm not following... you can override a parameter name the same way
and get the same behavior, no?
It is declared before any custom identifier. If you override it, then you
are working with own variable - not with auto variable.
BTW, I do wish you could change the label of the scope that arguments went
into, so that you could use that label to refer to function parameters. If
we allowed that it'd perhaps be the best of both worlds: you'd be
guaranteed access to all auto variables and parameters, and that access
wouldn't need to be tied to the function name (which can be both painful
and error prone).
We can talk about compiler directive.
PRAGMA auto_variables_label(xxxx) -- require function scope only
BEGIN
IF xxxx.FOUND THEN
Regards
Pavel
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
2017-01-08 3:39 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/7/17 2:06 AM, Pavel Stehule wrote:
SELECT t1 := c1, t2 := c2, ...
- it can be PostgreSQL specific syntax - full control over design
- maximally robust against typo
- long syntax, but for short syntax can be used SELECT c1,c2,c3, .. INTO
STRICT recvarI don't think overloading a SQL command is a good idea. We'd be in trouble
if ANSI ever introduced :=. I think that could also conflict with existing
operators.
The ":=" operator is used ANSI/SQL already for named arguments. Isn't
probable so ANSI uses it in different context.
This is not overloading of SQL command - it is like annotations. It is
smart idea, so I was not surprised if ANSI/SQL reuses it. There is not any
possible construct, that is robust against typo - because assignment is
very verbose and natural.
ANSI - SQL/PSM uses two methods
1. multiassignment
SET (a,b,c) = (SELECT a, b, c ...)
2. auto variables in dedicated new scope
FOR scope_label IN SELECT a, b, c
DO
-- you can use variables a, b, c
-- you can use qualified variables scope_label.a, scope_label.b, ..
END FOR
This method is not possible in PL/pgSQL - but a work with record type is
similar
- what should be no_data_found behave?
Considering where we're at today, I don't think there should be a default
behavior; make the user specify somehow whether missing data is allowed or
not.I have nothing about a cost of "new syntax" implementation - but for me
- it looks like good solution for us - it can be win/win solution. It
breaks nothing - it introduce nice to have typo robust syntax.Related to that, I suspect we could add better support to existing
commands for at least some of these things. For example, SELECT ... INTO
NOMULTI (instead of STRICT) to indicate that multiple rows are an error but
missing data is
Another flag into NOMULTI can be solution too.
The new syntax ":=" has some advantages:
1. it robust against type - it is harder to do unwanted swap of variables,
and this mistake is very clear
2. the syntax ensure equality of target variables and source expressions.
I see valuable benefit of this syntax
Regards
Pavel
Show quoted text
OK.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
2017-01-08 3:31 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/7/17 5:39 AM, Pavel Stehule wrote:
I checked current implementation of FOUND variable. If we introduce new
auto variable ROW_COUNT - exactly like FOUND, then it doesn't introduce
any compatibility break.Except it would break every piece of code that had a row_count variable,
though I guess you could see which scoping level the variable had been
defined in.I think the right solution in this case is to replace GET DIAGNOSTICs with
something easier to use, but I'm not sure what that would be.
I invite any ideas?
Regards
Pavel
Show quoted text
I think this is another example of where not using some kind of character
to distinguish variables screws us. :/
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
BTW, I do wish you could change the label of the scope that arguments
went into, so that you could use that label to refer to function
parameters. If we allowed that it'd perhaps be the best of both worlds:
you'd be guaranteed access to all auto variables and parameters, and that
access wouldn't need to be tied to the function name (which can be both
painful and error prone).We can talk about compiler directive.
PRAGMA auto_variables_label(xxxx) -- require function scope only
If we know a list of all auto variables, then it can be on function or
block level - it can create aliases.
Regards
Pavel
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
2017-01-08 3:27 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/5/17 11:36 AM, Merlin Moncure wrote:
The C language really should be considered the gold standard here.
Changes did have to be made, like getting rid of the notoriously
broken and insecure gets(), but they were made very, very slowly and
unobtrusively.For those not familiar... how did they accomplish that?
I'm certainly fine with changes being made very slowly. We carried the
missing_From GUC around for like a decade before ditching it. We have other
GUCs that have defaulted to not allowing silly behavior for a long time as
well. We might well need to leave the default for compatibility GUCs set to
current behavior for several more releases, to allow people to start
specifying which behavior they want.I agree with Robert that there needs to be consensus that a change needs
to be made, but frankly I think 50% of this thread was people disagreeing
with *ANY* change that would be incompatible. IMHO that's a ridiculous
position that does not match expectations outside of plpgsql. That kind of
expectation means we have absolutely no way of fixing past mistakes.
Certainly, there also needs to be agreement on what the new behavior
should be, but again, what I observed was an adamant insistence that
absolutely no break would be permitted.
As for using GUCs for these changes and that impact on extensions, I don't
see why that won't work for what we're discussing here. In a worst-case
scenario, extension authors would need to specify what behavior they wanted
in their extensions instead of blindly accepting the default, by making
sure those options were set for each function they defined. While it would
certainly be nice to avoid that extra work, all the necessary
infrastructure to handle that is already in place. And if we wanted to
avoid that hassle, we could allow custom GUC settings on extensions, like
we currently do for roles and databases.
The discussion related to plpgsql2, future development of plpgsql has more
levels, topics
1. incompatible changes - INTO, INTO STRICT, FOUND - there is a agreement
so current behave is not ideal for all cases, but there is not a agreement
so it broken and should be "fixed" - GUC doesn't helps here.
2. new features - the question was "how much we would to move PL/pgSQL from
verbose Ada language to different place - convention against configuration
principle", and what (if) conventions should be default. GUC can partially
helps.
I still hope so there is some space for finding a partial agreement - and
we can do some evolution steps forward.
I would not to use GUC like "We cannot to find a agreement, so we use GUC
and somebody will use this feature, some one not" - it is not way how to do
things better long time.
Jim, Marko, Joel - is there a place, features where we can find a partial
agreement? If it is, then we can move our view there.
Regards
Pavel
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
Related to that, I suspect we could add better support to existing
commands for at least some of these things. For example, SELECT ... INTO
NOMULTI (instead of STRICT) to indicate that multiple rows are an error but
missing data isAnother flag into NOMULTI can be solution too.
The new syntax ":=" has some advantages:
1. it robust against type - it is harder to do unwanted swap of variables,
and this mistake is very clear
should be "against typo", sorry
Pavel
Show quoted text
2. the syntax ensure equality of target variables and source expressions.
On Sat, Jan 7, 2017 at 8:56 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Jim, Marko, Joel - is there a place, features where we can find a partial agreement? If it is, then we can move our view there.
I have decided I definitively want a new language, and I'm willing to
pay for it.
Hopefully the community will join forces and contribute with ideas and
code, but with or without you or the rest of the community, plpgsql2
is going to happen.
Call it pltrustly or plpgsql2, I don't care. I just care about ending
my suffering from being forced writing plpgsql every day. It sucks,
and I'm going to end it.
I'm just too fed up with the annoyances of plpgsql. I cannot care less
about _hypothetical_ incompatibility problems,
I think your arguments "this is like Perl6 or Python3" are delusional.
You can easily intermix plpgsql and plpgsql2 in the same
"application", something you cannot do with Perl6 or Python3. So
please stop using that as an argument.
If anyone has an application where the hypothetical incompatibility
problems would be a problem, then just continue to use plpgsql.
And please kill all these GUCs ideas. The best thing with PostgreSQL
is the natural expected behaviour of the default configuration.
Contrary to MySQL where you have to enable lots and lots of
configuration options just to get a behaviour you expect as a novice
user.
It's much better to just come together and agree on whatever we have
learned during the last 15 years of PL/pgSQL1, and sample all ideas
during a year maybe, and decide what to put into PL/pgSQL2. To make it
useful, we should aim to not break compatibility for _most_ code, but
accept some necessary rewrites of functions with deprecated
anti-patterns.
I'm even willing to suggest it might be a good idea to first try out
PL/pgSQL2 at Trustly, and after a year of usage, report back to the
community of our findings on how well it worked out for us, to allow
all others to learn from our mistakes during our first year of using
the new language. That way less people and companies will have to
suffer when we discover what we got wrong in what we thought would
work out well for us.
During the same trial period maybe your company Pavel and others can
try out their ideas of a PL/pgSQL2 and implement it, see how it works
out for you, and then report back to the community on your findings
from production environments.
That way we can avoid all these hypothetical discussions on what will
be good or bad without having any empirical evidence at hand.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 1/8/17 2:52 AM, Joel Jacobson wrote:
And please kill all these GUCs ideas. The best thing with PostgreSQL
is the natural expected behaviour of the default configuration.
Contrary to MySQL where you have to enable lots and lots of
configuration options just to get a behaviour you expect as a novice
user.
The only reason to use GUCs or some other kind of backwards
compatibility setting would be to allow the current plpgsql itself to
move forwards. If you think that's a dead end (which I can certainly
understand) then they make no sense at all.
It's much better to just come together and agree on whatever we have
learned during the last 15 years of PL/pgSQL1, and sample all ideas
during a year maybe, and decide what to put into PL/pgSQL2. To make it
useful, we should aim to not break compatibility for _most_ code, but
accept some necessary rewrites of functions with deprecated
anti-patterns.
If we're going to create a brand new language then I think it would be
extremely foolish to keep *any* of the current pain points around. Off
the top of my head:
- variables must have an identifier (what $ in most languages does). The
steps you have to go through to avoid simple naming collisions are insane.
- Support for composite types needs to be stronger. Off the top of my
head, you need to be able to reference an element name via a variable.
OR, maybe it'd be better to just provide a plpgsql equivalent to a dict.
- GET DIAGNOSTICS and their ilk need to die. There needs to be an easier
way to get that kind of info back (perhaps via an automatic
composite/record/dict).
- There needs to be real support for dealing with exceptions. IE: get a
composite of all exception deatils, modify parts of it, then re-raise
with the new info.
- Real support for using variables as identifiers / nothing restricted
to only accepting a Const.
- Support for the notion of a variable being unset (which is NOT the
same thing as NULL).
That said, I'll bet we still get some of that wrong, so there better be
some way of fixing those issues down the road...
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 1/7/17 11:44 PM, Pavel Stehule wrote:
This is not overloading of SQL command - it is like annotations. It is
smart idea, so I was not surprised if ANSI/SQL reuses it.
SHas ANSI declared that they will NEVER support := in a SELECT that's
not running in a stored function? Because if they haven't done that,
there's nothing preventing them from doing just that. If that happens
we're going to have some very difficult choices to make.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 1/8/17 12:03 AM, Pavel Stehule wrote:
BTW, I do wish you could change the label of the scope that
arguments went into, so that you could use that label to refer
to function parameters. If we allowed that it'd perhaps be the
best of both worlds: you'd be guaranteed access to all auto
variables and parameters, and that access wouldn't need to be
tied to the function name (which can be both painful and error
prone).We can talk about compiler directive.
PRAGMA auto_variables_label(xxxx) -- require function scope only
If we know a list of all auto variables, then it can be on function or
block level - it can create aliases.
Oh, the problem is that if you have an argument with the same name as an
auto variable you're in trouble.
Probably the easiest thing is to have a scope that sits above the scope
containing the arguments, and then allow the user to rename both scopes
if so desired. So in effect you'd end up with
<<plpgsql>> -- new scope
DECLARE
FOUND;
etc
BEGIN
<<function_name>>
DECLARE
argument_1;
argument_2;
BEGIN
-- User supplied block goes here, with optional label
END;
END;
Alternatively, we could do...
<<function_name>>
DECLARE
FOUND;
etc
BEGIN
DECLARE-- User's DECLARE
argument_1;
argomuent_2;
-- User supplied declare code
BEGIN -- User's BEGIN
....
END
That removes one level of nesting. It's probably better to go with the
first option though, since it's simpler.
In both cases, I'd really like the ability to rename those blocks.
#pragma would be fine for that.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-09 0:37 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/8/17 2:52 AM, Joel Jacobson wrote:
And please kill all these GUCs ideas. The best thing with PostgreSQL
is the natural expected behaviour of the default configuration.
Contrary to MySQL where you have to enable lots and lots of
configuration options just to get a behaviour you expect as a novice
user.The only reason to use GUCs or some other kind of backwards compatibility
setting would be to allow the current plpgsql itself to move forwards. If
you think that's a dead end (which I can certainly understand) then they
make no sense at all.It's much better to just come together and agree on whatever we have
learned during the last 15 years of PL/pgSQL1, and sample all ideas
during a year maybe, and decide what to put into PL/pgSQL2. To make it
useful, we should aim to not break compatibility for _most_ code, but
accept some necessary rewrites of functions with deprecated
anti-patterns.If we're going to create a brand new language then I think it would be
extremely foolish to keep *any* of the current pain points around. Off the
top of my head:- variables must have an identifier (what $ in most languages does). The
steps you have to go through to avoid simple naming collisions are insane.
just note - from 9.0 the collisions are not a issue
- Support for composite types needs to be stronger. Off the top of my
head, you need to be able to reference an element name via a variable. OR,
maybe it'd be better to just provide a plpgsql equivalent to a dict.
This point self needs significant code refactoring - maybe total rewriting
PL executor - it allows to change expression result data type in cycle. It
doesn't mean so I fully disagree with this point, but it is not easy to
implement it in type strict environment - C, C++, Pascal, Ada - hasn't any
similar - maybe it is possible with some libraries.
- GET DIAGNOSTICS and their ilk need to die. There needs to be an easier
way to get that kind of info back (perhaps via an automatic
composite/record/dict).
It is about performance - probably you wouldn't to fill all dict fields
after any statement.
- There needs to be real support for dealing with exceptions. IE: get a
composite of all exception deatils, modify parts of it, then re-raise with
the new info.- Real support for using variables as identifiers / nothing restricted to
only accepting a Const.
second point that enforces new PL environment - writing from scratch - and
it hides the cost of dynamic SQL
- Support for the notion of a variable being unset (which is NOT the same
thing as NULL).That said, I'll bet we still get some of that wrong, so there better be
some way of fixing those issues down the road...
With these requests you have to specify first, how much close will be your
ideal language with PostgreSQL. Currently PL/pgSQL is pretty close - with
some impacts. Your mentioned features can requires more independent
environment from Postgres.
What is really weak in plpgsql is a left side of assign statement and
missing some global/module/extensions variables.
Maybe if we integrate more PLLua or PLPython, PLPerl it can be better for
these requests.
I am not sure about benefit new only PostgreSQL specific language. What do
you think about Lua - it is light, pretty fast, dynamic, fast dictionary
API that can mask lot of internals.
Regards
Pavel
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
2017-01-09 1:10 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/8/17 12:03 AM, Pavel Stehule wrote:
BTW, I do wish you could change the label of the scope that
arguments went into, so that you could use that label to refer
to function parameters. If we allowed that it'd perhaps be the
best of both worlds: you'd be guaranteed access to all auto
variables and parameters, and that access wouldn't need to be
tied to the function name (which can be both painful and error
prone).We can talk about compiler directive.
PRAGMA auto_variables_label(xxxx) -- require function scope only
If we know a list of all auto variables, then it can be on function or
block level - it can create aliases.Oh, the problem is that if you have an argument with the same name as an
auto variable you're in trouble.
I didn't well explained my idea
It is similar to your plpgsql scope. You are introducing the convention. I
proposed a explicit specification. The result is similar.
Probably the easiest thing is to have a scope that sits above the scope
containing the arguments, and then allow the user to rename both scopes if
so desired. So in effect you'd end up with<<plpgsql>> -- new scope
DECLARE
FOUND;
etc
BEGIN
<<function_name>>
DECLARE
argument_1;
argument_2;
BEGIN
-- User supplied block goes here, with optional label
END;
END;
It is similar to
PRAGMA auto_variables_namespace(plpgsql);
BEGIN
...
END;
Using PRAGMA is more verbose - it is useful for code audit, review - it is
speaking "I will overwrite some auto variables here, and I need special
namespace"
plpgsql_check, maybe plpgsql self can raise warning if these variables are
shadowed and some option/pragma is not used. Maybe current extra check does
it already.
Alternatively, we could do...
<<function_name>>
DECLARE
FOUND;
etc
BEGIN
DECLARE-- User's DECLARE
argument_1;
argomuent_2;
-- User supplied declare code
BEGIN -- User's BEGIN
....
ENDThat removes one level of nesting. It's probably better to go with the
first option though, since it's simpler.
You are forgot on function paramaters - somebody can use a function
argument like FOUND, .. So auto variables should to be declared in most
top namespace.
Usually it is invisible for users - one, two more namespaces has zero cost
for compilation and absolute zero impact for evaluation.
Show quoted text
In both cases, I'd really like the ability to rename those blocks. #pragma
would be fine for that.--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
2017-01-09 0:39 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/7/17 11:44 PM, Pavel Stehule wrote:
This is not overloading of SQL command - it is like annotations. It is
smart idea, so I was not surprised if ANSI/SQL reuses it.SHas ANSI declared that they will NEVER support := in a SELECT that's not
running in a stored function? Because if they haven't done that, there's
nothing preventing them from doing just that. If that happens we're going
to have some very difficult choices to make.
No, there is nothing declared in ANSI. But currently in ANSI is not using
one operator for two different thing.
Regards
Pavel
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
Hi
Real support for using variables as identifiers / nothing restricted to
only accepting a Const.
This point is problematic not only from performance perspective.
if you don't use some special syntax and you allow variables as identifier,
then you will got a ambiguous situation quickly - although variables can
have special symbol prefix
SELECT * FROM tab WHERE $var1 = $var3
What is $var1, what is $var2? identifier or value?
Regards
Pavel
On Sun, Jan 8, 2017 at 2:52 AM, Joel Jacobson <joel@trustly.com> wrote:
On Sat, Jan 7, 2017 at 8:56 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Jim, Marko, Joel - is there a place, features where we can find a partial agreement? If it is, then we can move our view there.
I have decided I definitively want a new language, and I'm willing to
pay for it.
well, as they say, "money talks" :-D.
Hopefully the community will join forces and contribute with ideas and
code, but with or without you or the rest of the community, plpgsql2
is going to happen.
Call it pltrustly or plpgsql2, I don't care. I just care about ending
my suffering from being forced writing plpgsql every day. It sucks,
and I'm going to end it.
Curious, are you mainly troubled by the 'INTO STRICT' family of
problems? Or something else? Pavel has scored some points with PRAGMA
syntax and ISTM that does not require compatibility break.
And please kill all these GUCs ideas. The best thing with PostgreSQL
is the natural expected behaviour of the default configuration.
Contrary to MySQL where you have to enable lots and lots of
configuration options just to get a behaviour you expect as a novice
user.
I think there is a lot of support for this point of view. Jim is
notable outlier here, but for the most part we don't do language
behavior changes with GUC.
It's much better to just come together and agree on whatever we have
learned during the last 15 years of PL/pgSQL1, and sample all ideas
during a year maybe, and decide what to put into PL/pgSQL2. To make it
useful, we should aim to not break compatibility for _most_ code, but
accept some necessary rewrites of functions with deprecated
anti-patterns.
Agreed: If you want to break compatibility, pushing a new language is
the better way than GUC. If you got consensus on this, having both
languages side by side supported for a while (maybe 4-5 releases) is
they way to go, and finally the only language is frozen and moved to
extension. But this is a lot of work and aggravation, are you *sure*
you can only get what you want with a full compatibility break?
With respect to your company developers specifically? I'm genuinely
curious if you've taken a good look at pl/v8 and why you've determined
it's not suitable to move forward with. It's got a different set of
headaches, but is really fast, and sometimes wonder if with some
alternative preprocessing (like coffeescript but geared towards SQL)
could have some long term promise.
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Jan 9, 2017 at 12:37 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
If we're going to create a brand new language then I think it would be
extremely foolish to keep *any* of the current pain points around. Off the
top of my head:- variables must have an identifier (what $ in most languages does). The
steps you have to go through to avoid simple naming collisions are insane.
This is exactly what we did not want to do with this project. The idea is
to create a language which is really close to PL/PgSQL, but removes some of
the brain diarrhoea currently present.
Now, this *is* a problem, and the solution we had (well I, mostly, at this
point) in mind is to use the underscore prefix for all input variables and
make OUT parameters invisible to queries inside function bodies unless
explicitly prefixed with OUT. As far as I can tell this eliminates most
if not all collisions while staying almost completely compatible with
arguably well-written PL/PgSQL 1.
- Support for the notion of a variable being unset (which is NOT the same
thing as NULL).
My idea was that the currently unsupported combination of NOT NULL and no
DEFAULT would mean "has to be assigned to a non-NULL value before it can be
read from, or an exception is thrown". Solves the most common use case and
is backwards compatible.
.m
On 01/09/2017 06:12 PM, Merlin Moncure wrote:
With respect to your company developers specifically? I'm genuinely
curious if you've taken a good look at pl/v8 and why you've determined
it's not suitable to move forward with. It's got a different set of
headaches, but is really fast, and sometimes wonder if with some
alternative preprocessing (like coffeescript but geared towards SQL)
could have some long term promise.
Yeah, especially if built against a modern V8, with all or most of the
ES6 stuff. Without template strings and lexically scoped variables it's
very unpleasant for large functions, but with them it's usable. It's
also something a very large number of people are familiar with. As you
say it's damn fast.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 1/9/17 5:30 PM, Marko Tiikkaja wrote:
This is exactly what we did not want to do with this project. The idea
is to create a language which is really close to PL/PgSQL, but removes
some of the brain diarrhoea currently present.
As a general comment, ISTM it'd be much better to do as much as we can
in the current language then. It's going to take a LOT to get people to
switch to a different language, so there needs to be a LOT of added value.
Now, this *is* a problem, and the solution we had (well I, mostly, at
this point) in mind is to use the underscore prefix for all input
variables and make OUT parameters invisible to queries inside function
bodies unless explicitly prefixed with OUT. As far as I can tell this
eliminates most if not all collisions while staying almost completely
compatible with arguably well-written PL/PgSQL 1.
That might be workable... it's still rather ugly though.
I don't see prefixing everything with _ as being useful though; people
can already do that if they want to uglify the function's argument names.
I do think there's stuff that could be done along these lines with
namespaces though. Allowing users to rename the namespace that arguments
went into would be a huge step forward. I think having a separate
namespace for all the automatic variables would be a big help too.
Amusingly, that would allow users to set the namespace to '$', which
would (almost) give you $variable.
- Support for the notion of a variable being unset (which is NOT the
same thing as NULL).My idea was that the currently unsupported combination of NOT NULL and
no DEFAULT would mean "has to be assigned to a non-NULL value before it
can be read from, or an exception is thrown". Solves the most common
use case and is backwards compatible.
That won't allow you to use a variable in multiple places though... is
there a reason we couldn't support something like IS DEFINED and UNSET?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Jan 10, 2017 at 12:47 AM, Jim Nasby <Jim.Nasby@bluetreble.com>
wrote:
On 1/9/17 5:30 PM, Marko Tiikkaja wrote:
My idea was that the currently unsupported combination of NOT NULL and
no DEFAULT would mean "has to be assigned to a non-NULL value before it
can be read from, or an exception is thrown". Solves the most common
use case and is backwards compatible.That won't allow you to use a variable in multiple places though... is
there a reason we couldn't support something like IS DEFINED and UNSET?
I don't understand what your use case is. Could you demonstrate that with
some code you'd write if these features were in?
.m
On 1/9/17 5:12 PM, Merlin Moncure wrote:
Agreed: If you want to break compatibility, pushing a new language is
the better way than GUC. If you got consensus on this, having both
languages side by side supported for a while (maybe 4-5 releases) is
they way to go, and finally the only language is frozen and moved to
extension. But this is a lot of work and aggravation, are you *sure*
you can only get what you want with a full compatibility break?
FWIW, that work and aggravation part is what I hoped to avoid with GUCs.
I do think that whichever route we go, we're going to be stuck
supporting the old version for a LONG time. A big part of why
standard_conforming_strings was so ugly is users didn't have enough time
to adjust. If we'd had that enabled by default for 4-5 releases it
wouldn't have been nearly as much of an issue.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 1/9/17 3:01 AM, Pavel Stehule wrote:
You are forgot on function paramaters - somebody can use a function
argument like FOUND, .. So auto variables should to be declared in most
top namespace.
Right, that's why I said it was an alternative. I agree it would be
better to just have 2 explicit namespaces: the top one being auto
variables and the one below that being function arguments. The namespace
below that would be the top-most *user* block.
Both of the pre-defined namespaces need the ability to change their
name; I don't see any issue with using PRAGMA for that.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 1/9/17 5:53 PM, Marko Tiikkaja wrote:
My idea was that the currently unsupported combination of NOT
NULL and
no DEFAULT would mean "has to be assigned to a non-NULL value
before it
can be read from, or an exception is thrown". Solves the most
common
use case and is backwards compatible.That won't allow you to use a variable in multiple places though...
is there a reason we couldn't support something like IS DEFINED and
UNSET?I don't understand what your use case is. Could you demonstrate that
with some code you'd write if these features were in?
One use case is NEW and OLD in triggers. Checking to see if one or the
other is set is easier than checking TG_OP. It's also going to be faster
(probably MUCH faster; IIRC the comparison currently happens via SPI).
Another case is selecting into a record:
EXECUTE ... INTO rec;
IF rec IS DEFINED THEN
ELSE
EXECUTE <something else> INTO rec;
IF rec IS DEFINED THEN
...
Perhaps DEFINED is not the best keyword. Ultimately I want to know if a
variable has been assigned a value, as well as being able to mark a
variable as unassigned (though arguably you might not need to be able to
un-assign...).
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Jan 10, 2017 at 1:03 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 1/9/17 5:53 PM, Marko Tiikkaja wrote:
My idea was that the currently unsupported combination of NOT
NULL and
no DEFAULT would mean "has to be assigned to a non-NULL value
before it
can be read from, or an exception is thrown". Solves the most
common
use case and is backwards compatible.That won't allow you to use a variable in multiple places though...
is there a reason we couldn't support something like IS DEFINED and
UNSET?I don't understand what your use case is. Could you demonstrate that
with some code you'd write if these features were in?One use case is NEW and OLD in triggers. Checking to see if one or the
other is set is easier than checking TG_OP. It's also going to be faster
(probably MUCH faster; IIRC the comparison currently happens via SPI).
This sounds useless.
Another case is selecting into a record:
EXECUTE ... INTO rec;
IF rec IS DEFINED THEN
ELSE
EXECUTE <something else> INTO rec;
IF rec IS DEFINED THEN
And this a workaround for non-functional FOUND.
I can't get excited about this idea based on these examples.
.m
On 1/9/17 6:07 PM, Marko Tiikkaja wrote:
One use case is NEW and OLD in triggers. Checking to see if one or
the other is set is easier than checking TG_OP. It's also going to
be faster (probably MUCH faster; IIRC the comparison currently
happens via SPI).This sounds useless.
I guess you've not written much non-trivial trigger code then... the
amount of code duplication you end up with is quite ridiculous. It's
also a good example of why treating this as an exception and trapping
isn't a good solution either: you can already do that with triggers today.
Being able to check the existence of a variable is a very common idiom
in other languages, so I'm don't see why plpgsql shouldn't have it.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-10 2:02 GMT+01:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 1/9/17 6:07 PM, Marko Tiikkaja wrote:
One use case is NEW and OLD in triggers. Checking to see if one or
the other is set is easier than checking TG_OP. It's also going to
be faster (probably MUCH faster; IIRC the comparison currently
happens via SPI).This sounds useless.
I guess you've not written much non-trivial trigger code then... the
amount of code duplication you end up with is quite ridiculous. It's also a
good example of why treating this as an exception and trapping isn't a good
solution either: you can already do that with triggers today.Being able to check the existence of a variable is a very common idiom in
other languages, so I'm don't see why plpgsql shouldn't have it.
In strongly typed language like PLpgSQL is DEFINE little bit strange. On
second hand there are some elements of dynamic languages - record types and
polymorphics parameters.
Some languages has reflection API - some other like Oberon some special
statements - variable guards that allows safe casting and safe usage - it
is not far what we do with Node API.
Show quoted text
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On 1/7/17 6:39 AM, Pavel Stehule wrote:
ROW_COUNT .. shortcut for GET DIAGNOSTICS row_count = ROW_COUNT.
I don't see the point.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-10 5:59 GMT+01:00 Peter Eisentraut <peter.eisentraut@2ndquadrant.com
:
On 1/7/17 6:39 AM, Pavel Stehule wrote:
ROW_COUNT .. shortcut for GET DIAGNOSTICS row_count = ROW_COUNT.
I don't see the point.
A check how much rows was impacted by query is relative often task. So we
can do this task more user friendly.
Second motivation - ROW_COUNT is working for static and for dynamic SQL -
it can be partial replace of FOUND variable.
But now, when I am thinking about it - it can be strange for some users
too. Pretty often we use implicit LIMIT for query execution. So ROW_COUNT
can be probably different than users expecting.
Regards
Pavel
Show quoted text
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1/10/17 12:06 AM, Pavel Stehule wrote:
A check how much rows was impacted by query is relative often task. So
we can do this task more user friendly.Second motivation - ROW_COUNT is working for static and for dynamic SQL
- it can be partial replace of FOUND variable.
What is stopping anyone from claiming that their favorite diagnostic
item is also a relatively often task and request it to become an
automatic variable? Where does it stop?
It's not like PL/pgSQL is the king of brevity. Creating inconsistent
and arbitrary warts to save a few characters does not appear appealing.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-10 14:26 GMT+01:00 Peter Eisentraut <
peter.eisentraut@2ndquadrant.com>:
On 1/10/17 12:06 AM, Pavel Stehule wrote:
A check how much rows was impacted by query is relative often task. So
we can do this task more user friendly.Second motivation - ROW_COUNT is working for static and for dynamic SQL
- it can be partial replace of FOUND variable.What is stopping anyone from claiming that their favorite diagnostic
item is also a relatively often task and request it to become an
automatic variable? Where does it stop?
There is only two possible fields - ROW_COUNT and RESULT_OID. Result Oid is
not almost unused today. So stop is ROW_COUNT
It's not like PL/pgSQL is the king of brevity. Creating inconsistent
and arbitrary warts to save a few characters does not appear appealing.
yes
Regards
Pavel
Show quoted text
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Jan 10, 2017 at 2:26 PM, Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> wrote:
It's not like PL/pgSQL is the king of brevity.
This is essentially saying "PL/PgSQL isn't perfect, so we shouldn't try and
make it better". I hear this argument a lot, and as long as people keep
rejecting improvements for this reason they can keep saying it. It's a
self-fulfilling prophecy.
.m
On Tue, Jan 10, 2017 at 7:44 AM, Marko Tiikkaja <marko@joh.to> wrote:
On Tue, Jan 10, 2017 at 2:26 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:It's not like PL/pgSQL is the king of brevity.
This is essentially saying "PL/PgSQL isn't perfect, so we shouldn't try and
make it better". I hear this argument a lot, and as long as people keep
rejecting improvements for this reason they can keep saying it. It's a
self-fulfilling prophecy.
Agreed. But adding language features, especially syntactical ones,
demands prudence; there is good reason to limit keywords like that.
What about:
pgsql.rows
pgsql.found
pgsql.sqlerrm
etc
as automatic variables (I think this was suggested upthread).
Conflicts with existing structures is of course an issue but I bet it
could be worked out.
I also kinda disagree on the brevity point, or at least would like to
add some color. SQL is verbose in the sense of "let's make everything
an english language sentence" but incredibly terse relative to other
language implementations of the same task. Embedded SQL tends to be
uniformly clumsy due to all of the extra handling of errrors,
parameterization, etc. This is why we write plpgsql naturally.
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-11 15:37 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
On Tue, Jan 10, 2017 at 7:44 AM, Marko Tiikkaja <marko@joh.to> wrote:
On Tue, Jan 10, 2017 at 2:26 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:It's not like PL/pgSQL is the king of brevity.
This is essentially saying "PL/PgSQL isn't perfect, so we shouldn't try
and
make it better". I hear this argument a lot, and as long as people keep
rejecting improvements for this reason they can keep saying it. It's a
self-fulfilling prophecy.Agreed. But adding language features, especially syntactical ones,
demands prudence; there is good reason to limit keywords like that.
What about:
pgsql.rows
pgsql.found
pgsql.sqlerrm
etc
as automatic variables (I think this was suggested upthread).
Conflicts with existing structures is of course an issue but I bet it
could be worked out.
Any implicit namespace can be problem. But we can continue in default
unlabeled namespace for auto variables with possibility to specify this
namespace explicitly.
Regards
Pavel
On 1/10/17 8:44 AM, Marko Tiikkaja wrote:
On Tue, Jan 10, 2017 at 2:26 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com
<mailto:peter.eisentraut@2ndquadrant.com>> wrote:It's not like PL/pgSQL is the king of brevity.
This is essentially saying "PL/PgSQL isn't perfect, so we shouldn't try
and make it better". I hear this argument a lot, and as long as people
keep rejecting improvements for this reason they can keep saying it.
It's a self-fulfilling prophecy.
I'm not making that argument. But if the plan here is that PL/pgSQL is
too verbose, let's make it less verbose, then maybe, but let's see a
more complete plan for that.
The current syntax was chosen because it is SQL-compatible. Adding
redundant syntax to save a few characters without any new functionality
(performance, resource usage, safety, etc.) is a weak argument in the
overall scheme of things.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jan 11, 2017 at 11:11 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
The current syntax was chosen because it is SQL-compatible. Adding
redundant syntax to save a few characters without any new functionality
(performance, resource usage, safety, etc.) is a weak argument in the
overall scheme of things.
Yeah -- exactly. The few minor things that are not 100% SQL
compatible I find to be major headaches. Incompatible usage of INTO
for example.
This thread has been going on for quite some time now and is starting
to become somewhat circular. Perhaps we ought to organize the
various ideas and pain points presented in a wiki along with
conclusions, and in some cases if there is no solution that is
compatible with the current syntax.
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
2017-01-11 20:53 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
On Wed, Jan 11, 2017 at 11:11 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:The current syntax was chosen because it is SQL-compatible. Adding
redundant syntax to save a few characters without any new functionality
(performance, resource usage, safety, etc.) is a weak argument in the
overall scheme of things.Yeah -- exactly. The few minor things that are not 100% SQL
compatible I find to be major headaches. Incompatible usage of INTO
for example.
We not designed INTO usage in plpgsql - it is PL/SQL heritage.
PL/SQL = ADA + Oracle SQL; -- but sometimes the result is not perfect - Ada
was not designed be integrated with SQL
This thread has been going on for quite some time now and is starting
to become somewhat circular. Perhaps we ought to organize the
various ideas and pain points presented in a wiki along with
conclusions, and in some cases if there is no solution that is
compatible with the current syntax.
There is a language that is much better integrated with SQL - SQL/PSM
http://postgres.cz/wiki/SQL/PSM_Manual
It is less verbose, but still verbose language. It is static typed language
- so it can be bad for some people.
But due design based on SQL integration from base, there is less conflicts
between SQL and PL.
Regards
Pavel
Show quoted text
merlin
On Mon, Jan 9, 2017 at 6:53 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
I do think that whichever route we go, we're going to be stuck supporting
the old version for a LONG time. A big part of why
standard_conforming_strings was so ugly is users didn't have enough time to
adjust. If we'd had that enabled by default for 4-5 releases it wouldn't
have been nearly as much of an issue.
/me boggles. I think you are confused about the history here.
standard_conforming_strings had a generously long phase-in period.
- The E'' syntax and the standard_conforming_strings GUC were added in
PostgreSQL 8.0. The only legal value of standard_conforming_strings
was "false".
- In PostgreSQL 8.1, it became possible to set
standard_conforming_strings to "true", but the default was still
"false".
- In PostgreSQL 9.1, the default was changed to "true".
So there 6 major release from the time the GUC was added and 5 from
the time it became mutable before the default was flipped. We've now
had 5 more since the default was changed to "true". (No, it's not
time to remove the GUC yet. At least not in my opinion.)
One thing that made changing standard_conforming_strings particularly
painful was that it had knock-on effects on many language-specific
drivers not maintained by the core project (or just plain not
maintained). I don't think the language changes being proposed here
for PL/pgsql would have the same kind of impact, but some of them
would make it significantly harder to migrate to PostgreSQL from
Oracle, which some people might see as an anti-goal (as per other
nearby threads on making that easier).
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jan 11, 2017 at 2:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
- The E'' syntax and the standard_conforming_strings GUC were added in
PostgreSQL 8.0. The only legal value of standard_conforming_strings
was "false".- In PostgreSQL 8.1, it became possible to set
standard_conforming_strings to "true", but the default was still
"false".- In PostgreSQL 9.1, the default was changed to "true".
So there 6 major release from the time the GUC was added and 5 from
the time it became mutable before the default was flipped. We've now
had 5 more since the default was changed to "true". (No, it's not
time to remove the GUC yet. At least not in my opinion.)One thing that made changing standard_conforming_strings particularly
painful was that it had knock-on effects on many language-specific
drivers not maintained by the core project (or just plain not
maintained). I don't think the language changes being proposed here
for PL/pgsql would have the same kind of impact, but some of them
would make it significantly harder to migrate to PostgreSQL from
Oracle, which some people might see as an anti-goal (as per other
nearby threads on making that easier).
I don't think it's a simple matter of waiting N or N+M releases
(although I certainly did appreciate that we did it regardless). It
comes down to this: there's just no way to release changes that break
a lot of code without breaking a lot of code. It's all about
acknowledging that and judging it acceptable against the benefits you
get. For posterity, with respect to conforming strings, SQL
injection is an absolute scourge of the computing world so on balance
we did the right thing. Having said that, It's always good to do the
math and the calculation is primarily an economic one, I think,
merlin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 1/11/17 12:07 PM, Pavel Stehule wrote:
PL/SQL = ADA + Oracle SQL; -- but sometimes the result is not perfect -
Ada was not designed be integrated with SQL
...
There is a language that is much better integrated with SQL - SQL/PSM
I think it is worth considering ways to increase compatibility with
plsql, as well as pulling PSM into core. The former would be to help
migrating from Oracle; the latter would be to provide everyone a cleaner
built-in PL. (IMHO a PLSQL equivalent could certainly be an external
extension).
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers