support for CREATE MODULE

Started by Swaha Millerover 4 years ago33 messageshackers

swaha.miller@gmail.com

over 4 years ago

Hi,

I'm following up from Jim's POC for adding MODULE to PostgreSQL. [1]/messages/by-id/CAB_5SRebSCjO12=nLsaLCBw2vnkiNH7jcNchirPc0yQ2KmiknQ@mail.gmail.com

My proposal implements modules as schema objects to be stored in a new
system catalog pg_module with new syntax for CREATE [OR REPLACE] MODULE,
ALTER MODULE, DROP MODULE and for GRANT and REVOKE for privileges on
modules and module routines. I am attempting to follow the SQL spec.
However, for right now, I'm proposing to support only routines as module
contents, with local temporary tables and path specifications as defined
in the SQL spec, to be supported in a future submission. We could also
include support for variables depending on its status. [2]/messages/by-id/CAFj8pRD053CY_N4=6SvPe7ke6xPbh=K50LUAOwjC3jm8Me9Obg@mail.gmail.com

Following are some examples of what the new module syntax would look
like. The attached patch has detailed documentation.

CREATE MODULE mtest1 CREATE FUNCTION m1testa() RETURNS text
LANGUAGE sql
RETURN '1x';
SELECT mtest1.m1testa();
ALTER MODULE mtest1 CREATE FUNCTION m1testd() RETURNS text
LANGUAGE sql
RETURN 'm1testd';
SELECT mtest1.m1testd();
ALTER MODULE mtest1 RENAME TO mtest1renamed;
SELECT mtest1renamed.m1testd();
REVOKE ON MODULE mtest1 REFERENCES ON FUNCTION m1testa() FROM public;
GRANT ON MODULE mtest1 REFERENCES ON FUNCTION m1testa() TO
regress_priv_user1;

I am new to the PostgreSQL community and would really appreciate your
input and feedback.

Thanks,
Swaha Miller
Amazon Web Services

[1]: /messages/by-id/CAB_5SRebSCjO12=nLsaLCBw2vnkiNH7jcNchirPc0yQ2KmiknQ@mail.gmail.com
/messages/by-id/CAB_5SRebSCjO12=nLsaLCBw2vnkiNH7jcNchirPc0yQ2KmiknQ@mail.gmail.com

[2]: /messages/by-id/CAFj8pRD053CY_N4=6SvPe7ke6xPbh=K50LUAOwjC3jm8Me9Obg@mail.gmail.com
/messages/by-id/CAFj8pRD053CY_N4=6SvPe7ke6xPbh=K50LUAOwjC3jm8Me9Obg@mail.gmail.com

Pavel Stehule

pavel.stehule@gmail.com

over 4 years ago

In reply to: Swaha Miller (#1)

Re: support for CREATE MODULE

čt 3. 2. 2022 v 3:28 odesílatel Swaha Miller <swaha.miller@gmail.com>
napsal:

Hi,

I'm following up from Jim's POC for adding MODULE to PostgreSQL. [1]

My proposal implements modules as schema objects to be stored in a new
system catalog pg_module with new syntax for CREATE [OR REPLACE] MODULE,
ALTER MODULE, DROP MODULE and for GRANT and REVOKE for privileges on
modules and module routines. I am attempting to follow the SQL spec.
However, for right now, I'm proposing to support only routines as module
contents, with local temporary tables and path specifications as defined
in the SQL spec, to be supported in a future submission. We could also
include support for variables depending on its status. [2]

Following are some examples of what the new module syntax would look
like. The attached patch has detailed documentation.

CREATE MODULE mtest1 CREATE FUNCTION m1testa() RETURNS text
LANGUAGE sql
RETURN '1x';
SELECT mtest1.m1testa();
ALTER MODULE mtest1 CREATE FUNCTION m1testd() RETURNS text
LANGUAGE sql
RETURN 'm1testd';
SELECT mtest1.m1testd();
ALTER MODULE mtest1 RENAME TO mtest1renamed;
SELECT mtest1renamed.m1testd();
REVOKE ON MODULE mtest1 REFERENCES ON FUNCTION m1testa() FROM public;
GRANT ON MODULE mtest1 REFERENCES ON FUNCTION m1testa() TO
regress_priv_user1;

I am new to the PostgreSQL community and would really appreciate your
input and feedback.

I dislike this feature. The modules are partially redundant to schemas and
to extensions in Postgres, and I am sure, so there is no reason to
introduce this.

What is the benefit against schemas and extensions?

Regards

Pavel

Show quoted text

Thanks,
Swaha Miller
Amazon Web Services

[1]
/messages/by-id/CAB_5SRebSCjO12=nLsaLCBw2vnkiNH7jcNchirPc0yQ2KmiknQ@mail.gmail.com

[2]
/messages/by-id/CAFj8pRD053CY_N4=6SvPe7ke6xPbh=K50LUAOwjC3jm8Me9Obg@mail.gmail.com

Julien Rouhaud

rjuju123@gmail.com

over 4 years ago

In reply to: Pavel Stehule (#2)

Re: support for CREATE MODULE

Hi,

On Thu, Feb 03, 2022 at 05:25:27AM +0100, Pavel Stehule wrote:

čt 3. 2. 2022 v 3:28 odesílatel Swaha Miller <swaha.miller@gmail.com>
napsal:

Hi,

I'm following up from Jim's POC for adding MODULE to PostgreSQL. [1]

My proposal implements modules as schema objects to be stored in a new
system catalog pg_module with new syntax for CREATE [OR REPLACE] MODULE,
ALTER MODULE, DROP MODULE and for GRANT and REVOKE for privileges on
modules and module routines. I am attempting to follow the SQL spec.
However, for right now, I'm proposing to support only routines as module
contents, with local temporary tables and path specifications as defined
in the SQL spec, to be supported in a future submission. We could also
include support for variables depending on its status. [2]

I dislike this feature. The modules are partially redundant to schemas and
to extensions in Postgres, and I am sure, so there is no reason to
introduce this.

What is the benefit against schemas and extensions?

I agree with Pavel. It seems that it's mainly adding another namespacing layer
between schemas and objects, and it's likely going to create a mess.
That's also going to be problematic if you want to add support for module
variables, as you won't be able to use e.g.
dbname.schemaname.modulename.variablename.fieldname.

Also, my understanding was that the major interest of modules (at least for the
routines part) was the ability to make some of them private to the module, but
it doesn't look like it's doing that, so I also don't see any real benefit
compared to schemas and extensions.

Pavel Stehule

pavel.stehule@gmail.com

over 4 years ago

In reply to: Julien Rouhaud (#3)

Re: support for CREATE MODULE

čt 3. 2. 2022 v 5:46 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:

Hi,

On Thu, Feb 03, 2022 at 05:25:27AM +0100, Pavel Stehule wrote:

čt 3. 2. 2022 v 3:28 odesílatel Swaha Miller <swaha.miller@gmail.com>
napsal:

Hi,

I'm following up from Jim's POC for adding MODULE to PostgreSQL. [1]

My proposal implements modules as schema objects to be stored in a new
system catalog pg_module with new syntax for CREATE [OR REPLACE]

MODULE,

ALTER MODULE, DROP MODULE and for GRANT and REVOKE for privileges on
modules and module routines. I am attempting to follow the SQL spec.
However, for right now, I'm proposing to support only routines as

module

contents, with local temporary tables and path specifications as

defined

in the SQL spec, to be supported in a future submission. We could also
include support for variables depending on its status. [2]

I dislike this feature. The modules are partially redundant to schemas

and

to extensions in Postgres, and I am sure, so there is no reason to
introduce this.

What is the benefit against schemas and extensions?

I agree with Pavel. It seems that it's mainly adding another namespacing
layer
between schemas and objects, and it's likely going to create a mess.
That's also going to be problematic if you want to add support for module
variables, as you won't be able to use e.g.
dbname.schemaname.modulename.variablename.fieldname.

Also, my understanding was that the major interest of modules (at least
for the
routines part) was the ability to make some of them private to the module,
but
it doesn't look like it's doing that, so I also don't see any real benefit
compared to schemas and extensions.

The biggest problem is coexistence of Postgres's SEARCH_PATH object
identification, and local and public scopes used in MODULEs or in Oracle's
packages.

I can imagine MODULES as third level of database unit object grouping with
following functionality

1. It should support all database objects like schemas
2. all public objects should be accessed directly when outer schema is in
SEARCH_PATH
3. private objects cannot be accessed from other modules
4. modules should be movable between schemas, databases without a loss of
functionality
5. modules should to support renaming without loss of functionality
6. there should be redefined some rules of visibility, because there can be
new identifier's collisions and ambiguities
7. there should be defined relation of modules's objects and schema's
objects. Maybe an introduction of the default module can be a good idea.

I had the opportunity to see a man who modified routines in pgAdmin. It can
be hell, but if we introduce a new concept (and it is an important
concept), then there should be strong benefits - for example - possibility
of strong encapsulation of code inside modules (or some units - the name is
not important).

The problem with pgAdmin maybe can be solved better by adding some
annotations to database objects that allows more user friendly organization
in the object tree in pgAdmin (and similar tools). Maybe it can be useful
to have more tries (defined by locality, semantic, quality, ...).

Regards

Pavel

Swaha Miller

swaha.miller@gmail.com

over 4 years ago

In reply to: Pavel Stehule (#4)

Re: support for CREATE MODULE

Thank you for the feedback Pavel and Julien. I'll try to explain some of
the issues and points you raise to the best of my understanding.

The reason for modules is that it would serve as an organizational unit
that can allow setting permissions on those units. So, for example, all
functions in a module can be subject to setting access permissions on for
some user(s) or group(s). I didn't explain it well in the sgml docs, but
along with module syntax, I'm proposing introducing privileges to
grant/revoke on modules and routines in modules. And why modules for this
purpose? Because its in the SQL spec so seems like a way to do it.

I'm adding comments inline for the list of functionality you mentioned. I
look forward to discussing this more and figuring out how to make a useful
contribution to the community.

On Wed, Feb 2, 2022 at 11:22 PM Pavel Stehule <pavel.stehule@gmail.com>
wrote:

čt 3. 2. 2022 v 5:46 odesílatel Julien Rouhaud <rjuju123@gmail.com>
napsal:

Hi,

On Thu, Feb 03, 2022 at 05:25:27AM +0100, Pavel Stehule wrote:

čt 3. 2. 2022 v 3:28 odesílatel Swaha Miller <swaha.miller@gmail.com>
napsal:

Hi,

I'm following up from Jim's POC for adding MODULE to PostgreSQL. [1]

My proposal implements modules as schema objects to be stored in a new
system catalog pg_module with new syntax for CREATE [OR REPLACE]

MODULE,

ALTER MODULE, DROP MODULE and for GRANT and REVOKE for privileges on
modules and module routines. I am attempting to follow the SQL spec.
However, for right now, I'm proposing to support only routines as

module

contents, with local temporary tables and path specifications as

defined

in the SQL spec, to be supported in a future submission. We could also
include support for variables depending on its status. [2]

I dislike this feature. The modules are partially redundant to schemas

and

to extensions in Postgres, and I am sure, so there is no reason to
introduce this.

What is the benefit against schemas and extensions?

I agree with Pavel. It seems that it's mainly adding another namespacing
layer
between schemas and objects, and it's likely going to create a mess.
That's also going to be problematic if you want to add support for module
variables, as you won't be able to use e.g.
dbname.schemaname.modulename.variablename.fieldname.

I haven't yet added support for variables so will need to look into the
problems with this if we're going to do that.

Also, my understanding was that the major interest of modules (at least

for the
routines part) was the ability to make some of them private to the
module, but
it doesn't look like it's doing that, so I also don't see any real benefit
compared to schemas and extensions.

Yes, that is indeed the goal/use-case with setting permissions with grant
and revoke. Right now, I have proposed create and reference as the kinds of
access that can be controlled on modules, and reference as the kind of
access that can be controlled on routines inside modules.

The biggest problem is coexistence of Postgres's SEARCH_PATH object
identification, and local and public scopes used in MODULEs or in Oracle's
packages.

I am not extremely familiar with Oracle's packages, but do know of them.
I'm wondering if local and public scopes for MODULE is in the SQL spec? (I
will check for that...) My thinking was to implement functionality that
conforms to the SQL spec, not try to match Oracle's package which differs
from the spec in some ways.

I can imagine MODULES as third level of database unit object grouping with
following functionality

1. It should support all database objects like schemas

Do you mean that schemas should be groupable under modules? My thinking was
to follow what the SQL spec says about what objects should be in modules,
and I started with routines as one of the objects that there are use cases
for. Such a controlling access permissions on routines at some granularity
that is not an entire schema and not individual functions/procedures.

2. all public objects should be accessed directly when outer schema is in
SEARCH_PATH

Yes, an object inside a module is in a schema and can be accessed with
schemaname.func() as well as modulename.func() as well as
schemaname.modulename.func(). I think you are saying it should be
accessible with func() without a qualifying schemaname or modulename if the
schemaname is in the search path, and that sounds reasonable too. Unless,
of course, func() was created in a module, in which case access permissions
for the module and module contents will determine whether func() should be
directly accessible. In my current proposal, a previously created func()
can't be added to a module created later. The purpose of creating routines
inside a module (either when the module is created or after the module is
created) would be with the intent of setting access permissions on those
routines differently than for the outer schema.

3. private objects cannot be accessed from other modules

Yes, I hope that is going to be the case with setting permissions with
grant and revoke. Right now, I have proposed create and reference as the
kinds of access that can be controlled on modules, and reference as the
kind of access that can be controlled on routines inside modules.

4. modules should be movable between schemas, databases without a loss of
functionality

pg_dump will dump modules so that can provide ways of moving them between
databases. I hadn't envisioned moving modules between schemas, but can
think of ways that can be supported. Would the objects within the modules
also move implicitly to the new schema?

5. modules should to support renaming without loss of functionality

yes renaming of modules is supported in my proposal

6. there should be redefined some rules of visibility, because there can
be new identifier's collisions and ambiguities

I'm not sure I understand this point. Can you please explain more?

7. there should be defined relation of modules's objects and schema's
objects. Maybe an introduction of the default module can be a good idea.

I was thinking of module as a unit of organization (with the goal of
controlling access to it) of objects that are still in some schema, and the
module itself as an object that is also in a schema.

I had the opportunity to see a man who modified routines in pgAdmin. It
can be hell, but if we introduce a new concept (and it is an important
concept), then there should be strong benefits - for example - possibility
of strong encapsulation of code inside modules (or some units - the name is
not important).

The problem with pgAdmin maybe can be solved better by adding some
annotations to database objects that allows more user friendly organization
in the object tree in pgAdmin (and similar tools). Maybe it can be useful
to have more tries (defined by locality, semantic, quality, ...).

Regards

Pavel

Best regards,
Swaha

Pavel Stehule

pavel.stehule@gmail.com

over 4 years ago

In reply to: Swaha Miller (#5)

Re: support for CREATE MODULE

čt 3. 2. 2022 v 20:21 odesílatel Swaha Miller <swaha.miller@gmail.com>
napsal:

Thank you for the feedback Pavel and Julien. I'll try to explain some of
the issues and points you raise to the best of my understanding.

The reason for modules is that it would serve as an organizational unit
that can allow setting permissions on those units. So, for example, all
functions in a module can be subject to setting access permissions on for
some user(s) or group(s). I didn't explain it well in the sgml docs, but
along with module syntax, I'm proposing introducing privileges to
grant/revoke on modules and routines in modules. And why modules for this
purpose? Because its in the SQL spec so seems like a way to do it.

This part of the standard is dead - there is no strong reason to implement
it.

I'm adding comments inline for the list of functionality you mentioned. I
look forward to discussing this more and figuring out how to make a useful
contribution to the community.

On Wed, Feb 2, 2022 at 11:22 PM Pavel Stehule <pavel.stehule@gmail.com>
wrote:

čt 3. 2. 2022 v 5:46 odesílatel Julien Rouhaud <rjuju123@gmail.com>
napsal:

Hi,

On Thu, Feb 03, 2022 at 05:25:27AM +0100, Pavel Stehule wrote:

čt 3. 2. 2022 v 3:28 odesílatel Swaha Miller <swaha.miller@gmail.com>
napsal:

Hi,

I'm following up from Jim's POC for adding MODULE to PostgreSQL. [1]

My proposal implements modules as schema objects to be stored in a

new

system catalog pg_module with new syntax for CREATE [OR REPLACE]

MODULE,

ALTER MODULE, DROP MODULE and for GRANT and REVOKE for privileges on
modules and module routines. I am attempting to follow the SQL spec.
However, for right now, I'm proposing to support only routines as

module

contents, with local temporary tables and path specifications as

defined

in the SQL spec, to be supported in a future submission. We could

also

include support for variables depending on its status. [2]

I dislike this feature. The modules are partially redundant to schemas

and

to extensions in Postgres, and I am sure, so there is no reason to
introduce this.

What is the benefit against schemas and extensions?

I agree with Pavel. It seems that it's mainly adding another
namespacing layer
between schemas and objects, and it's likely going to create a mess.
That's also going to be problematic if you want to add support for module
variables, as you won't be able to use e.g.
dbname.schemaname.modulename.variablename.fieldname.

I haven't yet added support for variables so will need to look into the
problems with this if we're going to do that.

Also, my understanding was that the major interest of modules (at least

for the
routines part) was the ability to make some of them private to the
module, but
it doesn't look like it's doing that, so I also don't see any real
benefit
compared to schemas and extensions.

Yes, that is indeed the goal/use-case with setting permissions with grant
and revoke. Right now, I have proposed create and reference as the kinds of
access that can be controlled on modules, and reference as the kind of
access that can be controlled on routines inside modules.

The biggest problem is coexistence of Postgres's SEARCH_PATH object
identification, and local and public scopes used in MODULEs or in Oracle's
packages.

I am not extremely familiar with Oracle's packages, but do know of them.
I'm wondering if local and public scopes for MODULE is in the SQL spec? (I
will check for that...) My thinking was to implement functionality that
conforms to the SQL spec, not try to match Oracle's package which differs
from the spec in some ways.

I can imagine MODULES as third level of database unit object grouping
with following functionality

1. It should support all database objects like schemas

Do you mean that schemas should be groupable under modules? My thinking
was to follow what the SQL spec says about what objects should be in
modules, and I started with routines as one of the objects that there are
use cases for. Such a controlling access permissions on routines at some
granularity that is not an entire schema and not individual
functions/procedures.

SQLspec says so there can be just temporary tables and routines. It is
useless. Unfortunately SQL/PSM came too late and there is no progress in
the last 20 years. It is a dead horse.

2. all public objects should be accessed directly when outer schema is in
SEARCH_PATH

Yes, an object inside a module is in a schema and can be accessed with
schemaname.func() as well as modulename.func() as well as
schemaname.modulename.func(). I think you are saying it should be
accessible with func() without a qualifying schemaname or modulename if the
schemaname is in the search path, and that sounds reasonable too. Unless,
of course, func() was created in a module, in which case access permissions
for the module and module contents will determine whether func() should be
directly accessible. In my current proposal, a previously created func()
can't be added to a module created later. The purpose of creating routines
inside a module (either when the module is created or after the module is
created) would be with the intent of setting access permissions on those
routines differently than for the outer schema.

3. private objects cannot be accessed from other modules

Yes, I hope that is going to be the case with setting permissions with
grant and revoke. Right now, I have proposed create and reference as the
kinds of access that can be controlled on modules, and reference as the
kind of access that can be controlled on routines inside modules.

The permission is not enough strategy - if I implement some private
objects in the module, and I push this module to the schema on the search
path, the private objects need to be invisible. I don't want to allow
shadowing of public objects by private objects.

4. modules should be movable between schemas, databases without a loss of
functionality

pg_dump will dump modules so that can provide ways of moving them between
databases. I hadn't envisioned moving modules between schemas, but can
think of ways that can be supported. Would the objects within the modules
also move implicitly to the new schema?

I thought more about extending the CREATE EXTENSION command to support
modules.

5. modules should to support renaming without loss of functionality

yes renaming of modules is supported in my proposal

But if I call a module function from the same module, this should work
after renaming. That's mean so there should be some mechanism how to
implement routine call without necessity to use absolute path

6. there should be redefined some rules of visibility, because there can
be new identifier's collisions and ambiguities

I'm not sure I understand this point. Can you please explain more?

I can have function fx in schema s, and then I can have module s in public
schema with function fx. What will be called when I write s.fx() ?

7. there should be defined relation of modules's objects and schema's
objects. Maybe an introduction of the default module can be a good idea.

I was thinking of module as a unit of organization (with the goal of
controlling access to it) of objects that are still in some schema, and the
module itself as an object that is also in a schema.

I understand, but just this is not enough benefit for implementation, when
Postgres supports schemas and extensions already. The benefit can be better
encapsulation or better isolation than we have with schemas.

Show quoted text

I had the opportunity to see a man who modified routines in pgAdmin. It
can be hell, but if we introduce a new concept (and it is an important
concept), then there should be strong benefits - for example - possibility
of strong encapsulation of code inside modules (or some units - the name is
not important).

The problem with pgAdmin maybe can be solved better by adding some
annotations to database objects that allows more user friendly organization
in the object tree in pgAdmin (and similar tools). Maybe it can be useful
to have more tries (defined by locality, semantic, quality, ...).

Regards

Pavel

Best regards,
Swaha

Alvaro Herrera

alvherre@2ndquadrant.com

over 4 years ago

In reply to: Pavel Stehule (#4)

Re: support for CREATE MODULE

On 2022-Feb-03, Pavel Stehule wrote:

The biggest problem is coexistence of Postgres's SEARCH_PATH object
identification, and local and public scopes used in MODULEs or in Oracle's
packages.

I can imagine MODULES as third level of database unit object grouping with
following functionality

1. It should support all database objects like schemas

I proposed a way for modules to coexist with schemas that got no reply,
/messages/by-id/202106021908.ddmebx7qfdld@alvherre.pgsql

I still think that that idea is valuable; it would let us create
"private" routines, for example, which are good for encapsulation.
But the way it interacts with schemas means we don't end up with a total
mess in the namespace resolution rules. I argued that modules would
only have functions, and maybe a few other useful object types, but not
*all* object types, because we don't need all object types to become
private. For example, I don't think I would like to have data types or
casts to be private, so they can only be in a schema and they cannot be
in a module.

Of course, that idea of modules would also ease porting large DB-based
applications from other database systems.

What do others think?

--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/

Julien Rouhaud

rjuju123@gmail.com

over 4 years ago

In reply to: Alvaro Herrera (#7)

Re: support for CREATE MODULE

Hi,

On Thu, Feb 03, 2022 at 10:42:32PM -0300, Alvaro Herrera wrote:

On 2022-Feb-03, Pavel Stehule wrote:

The biggest problem is coexistence of Postgres's SEARCH_PATH object
identification, and local and public scopes used in MODULEs or in Oracle's
packages.

I can imagine MODULES as third level of database unit object grouping with
following functionality

1. It should support all database objects like schemas

I proposed a way for modules to coexist with schemas that got no reply,
/messages/by-id/202106021908.ddmebx7qfdld@alvherre.pgsql

Ah, sorry I missed this one.

I still think that that idea is valuable; it would let us create
"private" routines, for example, which are good for encapsulation.
But the way it interacts with schemas means we don't end up with a total
mess in the namespace resolution rules.

I argued that modules would
only have functions, and maybe a few other useful object types, but not
*all* object types, because we don't need all object types to become
private. For example, I don't think I would like to have data types or
casts to be private, so they can only be in a schema and they cannot be
in a module.

Of course, that idea of modules would also ease porting large DB-based
applications from other database systems.

What do others think?

This approach seems way better as it indeed fixes the qualification issues with
the patch proposed in this thread.

Bruce Momjian

bruce@momjian.us

over 4 years ago

In reply to: Swaha Miller (#1)

Re: support for CREATE MODULE

On Wed, Feb 2, 2022 at 06:28:30PM -0800, Swaha Miller wrote:

Hi,

I'm following up from Jim's POC for adding MODULE to PostgreSQL. [1]

My proposal implements modules as schema objects to be stored in a new
system catalog pg_module with new syntax for CREATE [OR REPLACE] MODULE,

You might want to consider the steps that are most successful at getting
Postgres patches accepted:

https://wiki.postgresql.org/wiki/Todo
Desirability -> Design -> Implement -> Test -> Review -> Commit

In this case, you have jumped right to Implement. Asking about
Desirability first can avoid a lot of effort.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#10

Swaha Miller

swaha.miller@gmail.com

over 4 years ago

In reply to: Alvaro Herrera (#7)

Re: support for CREATE MODULE

On Thu, Feb 3, 2022 at 5:42 PM Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:

On 2022-Feb-03, Pavel Stehule wrote:

The biggest problem is coexistence of Postgres's SEARCH_PATH object
identification, and local and public scopes used in MODULEs or in

Oracle's

packages.

I can imagine MODULES as third level of database unit object grouping

with

following functionality

1. It should support all database objects like schemas

I proposed a way for modules to coexist with schemas that got no reply,
/messages/by-id/202106021908.ddmebx7qfdld@alvherre.pgsql

Yes, I arrived a little after that thread, and used that as my starting
point.

The POC patch Jim Mlodgenski had on that thread was similar to your
proposed
way - modules were rows in pg_namespace, with the addition of a new column
in pg_namespace for the nspkind (module or schema.)

Jim had asked about two options -- the above approach and an alternative
one
of having a pg_module system catalog and got some input
/messages/by-id/2897116.1622642280@sss.pgh.pa.us

Picking up from there, I am exploring the alternative approach. And what
qualified
names would look like and how they get interpreted unambiguously, when
schemas and modules co-exist. (Also, being new to PostgreSQL, it has been a
great learning exercise for me on some of the internals of PostgreSQL.)

With modules being their own type of object stored in a pg_module system
catalog, deconstructing a qualified path could always give precedence to
schema over module. So when there is function f() in schema s and another
function f() in module s in schema public, then s.f() would invoke the
former.

What are some other directions we might want to take this patch?

I still think that that idea is valuable; it would let us create

"private" routines, for example, which are good for encapsulation.
But the way it interacts with schemas means we don't end up with a total
mess in the namespace resolution rules. I argued that modules would
only have functions, and maybe a few other useful object types, but not
*all* object types, because we don't need all object types to become
private. For example, I don't think I would like to have data types or
casts to be private, so they can only be in a schema and they cannot be
in a module.

Of course, that idea of modules would also ease porting large DB-based
applications from other database systems.

Indeed. Looking at commercial databases Oracle and Microsoft SQLServer,
they both have implemented module-like structures.

Swaha Miller

#11

Swaha Miller

swaha.miller@gmail.com

over 4 years ago

In reply to: Bruce Momjian (#9)

Re: support for CREATE MODULE

On Fri, Feb 4, 2022 at 10:46 AM Bruce Momjian <bruce@momjian.us> wrote:

On Wed, Feb 2, 2022 at 06:28:30PM -0800, Swaha Miller wrote:

Hi,

I'm following up from Jim's POC for adding MODULE to PostgreSQL. [1]

My proposal implements modules as schema objects to be stored in a new
system catalog pg_module with new syntax for CREATE [OR REPLACE] MODULE,

You might want to consider the steps that are most successful at getting
Postgres patches accepted:

https://wiki.postgresql.org/wiki/Todo
Desirability -> Design -> Implement -> Test -> Review -> Commit

In this case, you have jumped right to Implement. Asking about
Desirability first can avoid a lot of effort.

Thanks Bruce, that's really helpful. I was building on the discussion in
Jim's
original thread, which is why I went ahead with another POC implementation,
but I do consider this implementation as part of the desirability/design
aspect
and am hoping to get input from the community to shape this proposal/patch.

Swaha Miller

#12

Alvaro Herrera

alvherre@2ndquadrant.com

over 4 years ago

In reply to: Swaha Miller (#10)

Re: support for CREATE MODULE

On 2022-Feb-04, Swaha Miller wrote:

The POC patch Jim Mlodgenski had on that thread was similar to your
proposed way - modules were rows in pg_namespace, with the addition of
a new column in pg_namespace for the nspkind (module or schema.)

I don't agree that what he proposed was similar to my proposal. The
main problem I saw in his proposal is that he was saying that modules
would be *within* schemas, which is where I think the whole thing
derailed completely.

He said:

[ This patch ] [...] allows for 3-part (or 4 with the database name)
naming of objects within the module.

He then showed the following example:

CREATE SCHEMA foo;
CREATE MODULE foo.bar
CREATE FUNCTION hello() [...]
SELECT foo.bar.hello();

Notice the three-part name there. That's a disaster, because the name
resolution rules become very complicated or ambiguous. What I describe
avoids that disaster, by forcing there to be two-part names only: a
module lives on its own, not in a schema, so a function name always has
at most two parts (never three), and the first part can always be
resolved down to a pg_namespace row of some kind.

--
Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
"La conclusión que podemos sacar de esos estudios es que
no podemos sacar ninguna conclusión de ellos" (Tanenbaum)

#13

Tom Lane

tgl@sss.pgh.pa.us

over 4 years ago

In reply to: Alvaro Herrera (#12)

Re: support for CREATE MODULE

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

He said:

[ This patch ] [...] allows for 3-part (or 4 with the database name)
naming of objects within the module.

He then showed the following example:

CREATE SCHEMA foo;
CREATE MODULE foo.bar
CREATE FUNCTION hello() [...]
SELECT foo.bar.hello();

Notice the three-part name there. That's a disaster, because the name
resolution rules become very complicated or ambiguous.

Right. We've looked into that before --- when I made pg_namespace,
I called it that because I thought we might be able to support
nested namespaces --- but it'd really create a mess. In particular,
the SQL standard says what a three-part name means, and this ain't it.

If we invent modules I think they need to work more like extensions
naming-wise, ie they group objects but have no effect for naming.
Alternatively, you could insist that a module *is* a schema for naming
purposes, with some extra properties but acting exactly like a schema
for naming. But I don't see what that buys you that you can't get
from an extension that owns a schema that contains all its objects.

On the whole I'm kind of allergic to inventing an entire new concept
that has as much overlap with extensions as modules seem to. I'd
rather try to understand what functional requirements we're missing
and see if we can add them to extensions. Yeah, we won't end up being
bug-compatible with Oracle's feature, but that's not a project goal
anyway --- and where we have tried to emulate Oracle closely, it's
often not worked out well (poster child: to_date).

regards, tom lane

#14

Jim Mlodgenski

jimmy76@gmail.com

over 4 years ago

In reply to: Tom Lane (#13)

Re: support for CREATE MODULE

On Fri, Feb 4, 2022 at 5:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

On the whole I'm kind of allergic to inventing an entire new concept
that has as much overlap with extensions as modules seem to. I'd
rather try to understand what functional requirements we're missing
and see if we can add them to extensions. Yeah, we won't end up being
bug-compatible with Oracle's feature, but that's not a project goal
anyway --- and where we have tried to emulate Oracle closely, it's
often not worked out well (poster child: to_date).

Developers need a way to group related objects in some fashion so
that they can be more easily reasoned about. Modules are just the
way to do this in the spec, but if we want to leverage extensions,
that will work too. Many users who need this only have access through
a database connection. They wouldn't have access to the file system
to add a control file nor a script to add the objects. Enhancing
CREATE EXTENSION to be able to create some sort of empty extension
and then having the ability to add and remove objects from that
extension may be the minimum amount of functionality we would need
to give users the ability to group their objects.

#15

Nathan Bossart

nathandbossart@gmail.com

over 4 years ago

In reply to: Tom Lane (#13)

Re: support for CREATE MODULE

On Fri, Feb 04, 2022 at 05:12:43PM -0500, Tom Lane wrote:

If we invent modules I think they need to work more like extensions
naming-wise, ie they group objects but have no effect for naming.
Alternatively, you could insist that a module *is* a schema for naming
purposes, with some extra properties but acting exactly like a schema
for naming. But I don't see what that buys you that you can't get
from an extension that owns a schema that contains all its objects.

On the whole I'm kind of allergic to inventing an entire new concept
that has as much overlap with extensions as modules seem to. I'd
rather try to understand what functional requirements we're missing
and see if we can add them to extensions. Yeah, we won't end up being
bug-compatible with Oracle's feature, but that's not a project goal
anyway --- and where we have tried to emulate Oracle closely, it's
often not worked out well (poster child: to_date).

If I'm understanding correctly, you are suggesting that CREATE MODULE would
be more like creating an extension without a control file, installation
script, etc. Objects would be added aѕ members with something like ALTER
MODULE ADD, and members could share properties such as access control. And
this might be possible to do by enhancing CREATE EXTENSION instead of
creating a new catalog, dependency type, etc.

I think this could be a nice way to sidestep the naming resolution problems
discussed upthread while still allowing folks to group objects together in
some meaningful way. Also, while it might be nice to have separate CREATE
EXTENSION and CREATE MODULE commands, perhaps they would use roughly the
same code paths behind the scenes.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#16

Tom Lane

tgl@sss.pgh.pa.us

over 4 years ago

In reply to: Nathan Bossart (#15)

Re: support for CREATE MODULE

Nathan Bossart <nathandbossart@gmail.com> writes:

On Fri, Feb 04, 2022 at 05:12:43PM -0500, Tom Lane wrote:

On the whole I'm kind of allergic to inventing an entire new concept
that has as much overlap with extensions as modules seem to. I'd
rather try to understand what functional requirements we're missing
and see if we can add them to extensions. Yeah, we won't end up being
bug-compatible with Oracle's feature, but that's not a project goal
anyway --- and where we have tried to emulate Oracle closely, it's
often not worked out well (poster child: to_date).

If I'm understanding correctly, you are suggesting that CREATE MODULE would
be more like creating an extension without a control file, installation
script, etc. Objects would be added aѕ members with something like ALTER
MODULE ADD, and members could share properties such as access control. And
this might be possible to do by enhancing CREATE EXTENSION instead of
creating a new catalog, dependency type, etc.

I think this could be a nice way to sidestep the naming resolution problems
discussed upthread while still allowing folks to group objects together in
some meaningful way. Also, while it might be nice to have separate CREATE
EXTENSION and CREATE MODULE commands, perhaps they would use roughly the
same code paths behind the scenes.

Hm. If the functional requirement is "group objects without needing
any out-in-the-filesystem infrastructure", then I could see defining
a module as being exactly like an extension except there's no such
infrastructure --- and hence no concept of versions, plus pg_dump
needs to act differently. That's probably enough semantic difference
to justify using a separate word, even if we can share a lot of
code infrastructure.

regards, tom lane

#17

Peter Eisentraut

peter_e@gmx.net

over 4 years ago

In reply to: Tom Lane (#13)

Re: support for CREATE MODULE

On 04.02.22 23:12, Tom Lane wrote:

Right. We've looked into that before --- when I made pg_namespace,
I called it that because I thought we might be able to support
nested namespaces --- but it'd really create a mess. In particular,
the SQL standard says what a three-part name means, and this ain't it.

Modules are part of the SQL standard, so there is surely some
name-resolution system specified there as well.

#18

Swaha Miller

swaha.miller@gmail.com

over 4 years ago

In reply to: Tom Lane (#16)

Re: support for CREATE MODULE

On Fri, Feb 4, 2022 at 3:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm. If the functional requirement is "group objects without needing
any out-in-the-filesystem infrastructure", then I could see defining
a module as being exactly like an extension except there's no such
infrastructure --- and hence no concept of versions, plus pg_dump
needs to act differently. That's probably enough semantic difference
to justify using a separate word, even if we can share a lot of
code infrastructure.

Then as a first cut for modules, could we add CREATE MODULE
syntax which adds an entry to pg_extension like CREATE EXTENSION
does? And also add a new column to pg_extension to distinguish
modules from extensions.

The three-part path name resolution for functions would remain the
same, nothing would need to change there because of modules.

Would that be an acceptable direction to go?

Swaha

#19

Bruce Momjian

bruce@momjian.us

over 4 years ago

In reply to: Swaha Miller (#18)

Re: support for CREATE MODULE

On Thu, Feb 10, 2022 at 08:53:15AM -0800, Swaha Miller wrote:

On Fri, Feb 4, 2022 at 3:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm. If the functional requirement is "group objects without needing
any out-in-the-filesystem infrastructure", then I could see defining
a module as being exactly like an extension except there's no such
infrastructure --- and hence no concept of versions, plus pg_dump
needs to act differently. That's probably enough semantic difference
to justify using a separate word, even if we can share a lot of
code infrastructure.

Then as a first cut for modules, could we add CREATE MODULE
syntax which adds an entry to pg_extension like CREATE EXTENSION
does? And also add a new column to pg_extension to distinguish
modules from extensions.

The three-part path name resolution for functions would remain the
same, nothing would need to change there because of modules.

Would that be an acceptable direction to go?

Well, that would allow us to have CREATE EXTENSION syntax, but what
would it do that CREATE SCHEMA does not?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#20

Alvaro Herrera

alvherre@2ndquadrant.com

over 4 years ago

In reply to: Tom Lane (#13)

Re: support for CREATE MODULE

On 2022-Feb-04, Tom Lane wrote:

If we invent modules I think they need to work more like extensions
naming-wise, ie they group objects but have no effect for naming.

I think modules are an orthogonal concept to extensions, and trying to
mix both is messy.

The way I see modules working is as a "logical" grouping of objects --
they provide encapsulated units of functionality. A module has private
functions, which cannot be called except from other functions in the
same module. You can abstract them out of the database design, leaving
you with only the exposed functions, the public API.

An extension is a way to distribute or package objects. An extension
can contain a module, and of course you should be able to use an
extension to distribute a module, or even several modules. In fact, I
think it should be possible to have several extensions contribute
different objects to the same module.

But things like name resolution rules (search path) are not affected by
how the objects are distributed, whereas the search path is critical in
how you think about the objects in a module.

If modules are just going to be extensions, I see no point.

--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/

#21