Re: Final version of IDENTITY/GENERATED patch

Started by Zoltan Boszormenyiabout 19 years ago62 messageshackers
Jump to latest
#1Zoltan Boszormenyi
zboszor@dunaweb.hu

Resending compressed, it seems pgsql-patches
doesn't let me post so large patches.

Zoltan Boszormenyi �rta:

Show quoted text

Hi,

I have finished my GENERATED/IDENTITY patch
and now it does everything I wanted it to do. Changes
from the previous version:

- GENERATED columns work now
- extended testcase to test some GENERATED expressions
- extended documentation

Now it comes in an uncompressed context diff form,
out of habit I sent unified diffs before, sorry for that.
It applies to current CVS.

Please, review.

Thanks in advance,
Zolt�n B�sz�rm�nyi

Attachments:

psql-serial-33.diff.gzapplication/x-tar; name=psql-serial-33.diff.gzDownload
#2Zoltan Boszormenyi
zboszor@dunaweb.hu
In reply to: Zoltan Boszormenyi (#1)

Hi,

I think now this is really the final version.

Changes in this version is:
- when dropping a column that's referenced
by a GENERATED column, the GENERATED
column has to be also dropped. It's required by SQL:2003.
- COPY table FROM works correctly with IDENTITY
and GENERATED columns
- extended testcase to show the above two

To reiterate all the features that accumulated
over time, here's the list:

- extended catalog (pg_attribute) to keep track whether
the column is IDENTITY or GENERATED
- working GENERATED column that may reference
other regular columns; it extends the DEFAULT
infrastructure to allow storing complex expressions;
syntax for such columns:
colname type GENERATED ALWAYS AS ( expression )
- working IDENTITY column whose value is generated
after all other columns (regular or GENERATED)
are assigned with values and validated via their
NOT NULL and CHECK constraints; this allows
tighter numbering - the only case when there may be
missing serials are when UNIQUE indexes are failed
(which is checked on heap_insert() and heap_update()
and is a tougher nut to crack)
syntax is:
colname type GENERATED { ALWAYS | BY DEFAULT }
AS IDENTITY [ ( sequence options ) ]
the original SERIAL pseudo-type is left unmodified, the IDENTITY
concept is new and extends on it - PostgreSQL may have multiple
SERIAL columns in a table, but SQL:2003 requires that at most
one IDENITY column may exist in a table at any time
- Implemented the following TODOs:
- %Have ALTER TABLE RENAME rename SERIAL sequence names
- Allow SERIAL sequences to inherit permissions from the base table?
Actually the roles that have INSERT or UPDATE permissions
on the table gain permission on the sequence, too.
This makes the following TODO unneeded:
- Add DEFAULT .. AS OWNER so permission checks are done as the table owner
This would be useful for SERIAL nextval() calls and CHECK constraints.
- DROP DEFAULT is prohibited on GENERATED and IDENTITY columns
- One SERIAL column can be upgraded to IDENTITY via
ALTER COLUMN column SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
Same for downgrading, via:
ALTER COLUMN column DROP IDENTITY
- COPY and INSERT may use OVERRIDING SYSTEM VALUE
clause to override automatic generation and allow
to import dumped data unmodified
- Update is forbidden for GENERATED ALWAYS AS IDENTITY
columns entirely and for GENERATED ALWAYS AS (expr)
columns for other values than DEFAULT.
- ALTER COLUMN SET <sequence options> for
altering the supporting sequence; works on any
SERIAL-like or IDENTITY columns
- ALTER COLUMN RESTART [WITH] N
for changing only the next generated number in the
sequence.
- The essence of pg_get_serial_sequence() is exported
as get_relid_att_serial_sequence() to be used internally
by checks.
- CHECK constraints cannot reference IDENTITY or
GENERATED columns
- GENERATED columns cannot reference IDENTITY or
GENERATED columns
- dropping a column that's referenced by a GENERATED column
also drops the GENERATED column
- pg_dump dumps correct schema for IDENTITY and
GENERATED columns:
- ALTER COLUMN SET GENERATED ... AS IDENTITY
for IDENTITY columns after ALTER SEQUENCE OWNED BY
- correct GENERATED AS ( expression ) caluse in the table schema
- pg_dump dumps COPY OVERRIDING SYSTEM VALUE
for tables' date that have any GENERATED or
GENERATED ALWAYS AS IDENTITY columns.
- documentation and testcases

Please, review.

Best regards,
Zolt�n B�sz�rm�nyi

Attachments:

psql-serial-34.diff.gzapplication/x-tar; name=psql-serial-34.diff.gzDownload
#3Bruce Momjian
bruce@momjian.us
In reply to: Zoltan Boszormenyi (#2)

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

Zoltan Boszormenyi wrote:

Hi,

I think now this is really the final version.

Changes in this version is:
- when dropping a column that's referenced
by a GENERATED column, the GENERATED
column has to be also dropped. It's required by SQL:2003.
- COPY table FROM works correctly with IDENTITY
and GENERATED columns
- extended testcase to show the above two

To reiterate all the features that accumulated
over time, here's the list:

- extended catalog (pg_attribute) to keep track whether
the column is IDENTITY or GENERATED
- working GENERATED column that may reference
other regular columns; it extends the DEFAULT
infrastructure to allow storing complex expressions;
syntax for such columns:
colname type GENERATED ALWAYS AS ( expression )
- working IDENTITY column whose value is generated
after all other columns (regular or GENERATED)
are assigned with values and validated via their
NOT NULL and CHECK constraints; this allows
tighter numbering - the only case when there may be
missing serials are when UNIQUE indexes are failed
(which is checked on heap_insert() and heap_update()
and is a tougher nut to crack)
syntax is:
colname type GENERATED { ALWAYS | BY DEFAULT }
AS IDENTITY [ ( sequence options ) ]
the original SERIAL pseudo-type is left unmodified, the IDENTITY
concept is new and extends on it - PostgreSQL may have multiple
SERIAL columns in a table, but SQL:2003 requires that at most
one IDENITY column may exist in a table at any time
- Implemented the following TODOs:
- %Have ALTER TABLE RENAME rename SERIAL sequence names
- Allow SERIAL sequences to inherit permissions from the base table?
Actually the roles that have INSERT or UPDATE permissions
on the table gain permission on the sequence, too.
This makes the following TODO unneeded:
- Add DEFAULT .. AS OWNER so permission checks are done as the table owner
This would be useful for SERIAL nextval() calls and CHECK constraints.
- DROP DEFAULT is prohibited on GENERATED and IDENTITY columns
- One SERIAL column can be upgraded to IDENTITY via
ALTER COLUMN column SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
Same for downgrading, via:
ALTER COLUMN column DROP IDENTITY
- COPY and INSERT may use OVERRIDING SYSTEM VALUE
clause to override automatic generation and allow
to import dumped data unmodified
- Update is forbidden for GENERATED ALWAYS AS IDENTITY
columns entirely and for GENERATED ALWAYS AS (expr)
columns for other values than DEFAULT.
- ALTER COLUMN SET <sequence options> for
altering the supporting sequence; works on any
SERIAL-like or IDENTITY columns
- ALTER COLUMN RESTART [WITH] N
for changing only the next generated number in the
sequence.
- The essence of pg_get_serial_sequence() is exported
as get_relid_att_serial_sequence() to be used internally
by checks.
- CHECK constraints cannot reference IDENTITY or
GENERATED columns
- GENERATED columns cannot reference IDENTITY or
GENERATED columns
- dropping a column that's referenced by a GENERATED column
also drops the GENERATED column
- pg_dump dumps correct schema for IDENTITY and
GENERATED columns:
- ALTER COLUMN SET GENERATED ... AS IDENTITY
for IDENTITY columns after ALTER SEQUENCE OWNED BY
- correct GENERATED AS ( expression ) caluse in the table schema
- pg_dump dumps COPY OVERRIDING SYSTEM VALUE
for tables' date that have any GENERATED or
GENERATED ALWAYS AS IDENTITY columns.
- documentation and testcases

Please, review.

Best regards,
Zolt?n B?sz?rm?nyi

[ application/x-tar is not supported, skipping... ]

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#4Zoltan Boszormenyi
zboszor@dunaweb.hu
In reply to: Bruce Momjian (#3)

Hi!

Thanks.

However, in the meantime I made some changes
so the IDENTITY column only advances its sequence
if it fails its CHECK constraints or UNIQUE indexes.
I still have some work with expression indexes.
Should I post an incremental patch against this version
or a full patch when it's ready?
An incremental patch can still be posted when the feature
is agreed to be in 8.3 and actually applied. It only changes
some details in the new feature and doesn't change
behaviour of existing features.

Best regards,
Zolt�n B�sz�rm�nyi

Bruce Momjian �rta:

Show quoted text

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

Zoltan Boszormenyi wrote:

Hi,

I think now this is really the final version.

Changes in this version is:
- when dropping a column that's referenced
by a GENERATED column, the GENERATED
column has to be also dropped. It's required by SQL:2003.
- COPY table FROM works correctly with IDENTITY
and GENERATED columns
- extended testcase to show the above two

To reiterate all the features that accumulated
over time, here's the list:

- extended catalog (pg_attribute) to keep track whether
the column is IDENTITY or GENERATED
- working GENERATED column that may reference
other regular columns; it extends the DEFAULT
infrastructure to allow storing complex expressions;
syntax for such columns:
colname type GENERATED ALWAYS AS ( expression )
- working IDENTITY column whose value is generated
after all other columns (regular or GENERATED)
are assigned with values and validated via their
NOT NULL and CHECK constraints; this allows
tighter numbering - the only case when there may be
missing serials are when UNIQUE indexes are failed
(which is checked on heap_insert() and heap_update()
and is a tougher nut to crack)
syntax is:
colname type GENERATED { ALWAYS | BY DEFAULT }
AS IDENTITY [ ( sequence options ) ]
the original SERIAL pseudo-type is left unmodified, the IDENTITY
concept is new and extends on it - PostgreSQL may have multiple
SERIAL columns in a table, but SQL:2003 requires that at most
one IDENITY column may exist in a table at any time
- Implemented the following TODOs:
- %Have ALTER TABLE RENAME rename SERIAL sequence names
- Allow SERIAL sequences to inherit permissions from the base table?
Actually the roles that have INSERT or UPDATE permissions
on the table gain permission on the sequence, too.
This makes the following TODO unneeded:
- Add DEFAULT .. AS OWNER so permission checks are done as the table owner
This would be useful for SERIAL nextval() calls and CHECK constraints.
- DROP DEFAULT is prohibited on GENERATED and IDENTITY columns
- One SERIAL column can be upgraded to IDENTITY via
ALTER COLUMN column SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
Same for downgrading, via:
ALTER COLUMN column DROP IDENTITY
- COPY and INSERT may use OVERRIDING SYSTEM VALUE
clause to override automatic generation and allow
to import dumped data unmodified
- Update is forbidden for GENERATED ALWAYS AS IDENTITY
columns entirely and for GENERATED ALWAYS AS (expr)
columns for other values than DEFAULT.
- ALTER COLUMN SET <sequence options> for
altering the supporting sequence; works on any
SERIAL-like or IDENTITY columns
- ALTER COLUMN RESTART [WITH] N
for changing only the next generated number in the
sequence.
- The essence of pg_get_serial_sequence() is exported
as get_relid_att_serial_sequence() to be used internally
by checks.
- CHECK constraints cannot reference IDENTITY or
GENERATED columns
- GENERATED columns cannot reference IDENTITY or
GENERATED columns
- dropping a column that's referenced by a GENERATED column
also drops the GENERATED column
- pg_dump dumps correct schema for IDENTITY and
GENERATED columns:
- ALTER COLUMN SET GENERATED ... AS IDENTITY
for IDENTITY columns after ALTER SEQUENCE OWNED BY
- correct GENERATED AS ( expression ) caluse in the table schema
- pg_dump dumps COPY OVERRIDING SYSTEM VALUE
for tables' date that have any GENERATED or
GENERATED ALWAYS AS IDENTITY columns.
- documentation and testcases

Please, review.

Best regards,
Zolt?n B?sz?rm?nyi

[ application/x-tar is not supported, skipping... ]

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

#5Bruce Momjian
bruce@momjian.us
In reply to: Zoltan Boszormenyi (#4)

Zoltan Boszormenyi wrote:

Hi!

Thanks.

However, in the meantime I made some changes
so the IDENTITY column only advances its sequence
if it fails its CHECK constraints or UNIQUE indexes.
I still have some work with expression indexes.
Should I post an incremental patch against this version
or a full patch when it's ready?

Full patch.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#6Zoltan Boszormenyi
zboszor@dunaweb.hu
In reply to: Bruce Momjian (#5)
IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Hi

Bruce Momjian �rta:

Zoltan Boszormenyi wrote:

Hi!

Thanks.

However, in the meantime I made some changes
so the IDENTITY column only advances its sequence
if it fails its CHECK constraints or UNIQUE indexes.
I still have some work with expression indexes.
Should I post an incremental patch against this version
or a full patch when it's ready?

Full patch.

Then here it is. Now it's really finished, I promise. :-)
Changes:

- unique index checks are done in two steps
to avoid inflating the sequence if a unique index check
is failed that doesn't reference the IDENTITY column
- to minimize runtime impact of checking whether
an index references the IDENTITY column and skipping it
in the first step in ExecInsertIndexTuples(), I introduced
a new attribute in the pg_index catalog. I had to place it
in the middle of the fixed size attributes because I had
mysterious crashes otherwise. This means the attributes
are renumbered. This attribute is determined during
CREATE INDEX and recomputed for all indexes defined
on the table during ALTER TABLE SET/DROP IDENTITY.
- as a consequence, IDENTITY/GENERATED can now
have CHECK constraints, this limit was removed.
- modified testcase for the above changes
- reworded documentation

Please, review.

Best regards,
Zolt�n B�sz�rm�nyi

Attachments:

psql-serial-36.diff.gzapplication/x-tar; name=psql-serial-36.diff.gzDownload
#7Bruce Momjian
bruce@momjian.us
In reply to: Zoltan Boszormenyi (#6)
Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

Zoltan Boszormenyi wrote:

Hi

Bruce Momjian ?rta:

Zoltan Boszormenyi wrote:

Hi!

Thanks.

However, in the meantime I made some changes
so the IDENTITY column only advances its sequence
if it fails its CHECK constraints or UNIQUE indexes.
I still have some work with expression indexes.
Should I post an incremental patch against this version
or a full patch when it's ready?

Full patch.

Then here it is. Now it's really finished, I promise. :-)
Changes:

- unique index checks are done in two steps
to avoid inflating the sequence if a unique index check
is failed that doesn't reference the IDENTITY column
- to minimize runtime impact of checking whether
an index references the IDENTITY column and skipping it
in the first step in ExecInsertIndexTuples(), I introduced
a new attribute in the pg_index catalog. I had to place it
in the middle of the fixed size attributes because I had
mysterious crashes otherwise. This means the attributes
are renumbered. This attribute is determined during
CREATE INDEX and recomputed for all indexes defined
on the table during ALTER TABLE SET/DROP IDENTITY.
- as a consequence, IDENTITY/GENERATED can now
have CHECK constraints, this limit was removed.
- modified testcase for the above changes
- reworded documentation

Please, review.

Best regards,
Zolt?n B?sz?rm?nyi

[ application/x-tar is not supported, skipping... ]

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zoltan Boszormenyi (#6)
Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Zoltan Boszormenyi <zboszor@dunaweb.hu> writes:

[ IDENTITY/GENERATED patch ]

I got around to reviewing this today.

- unique index checks are done in two steps
to avoid inflating the sequence if a unique index check
is failed that doesn't reference the IDENTITY column

This is just not acceptable --- there is nothing in the standard that
requires such behavior, and I dislike the wide-ranging kluges you
introduced to support it. Please get rid of that and put the behavior
back into ordinary DEFAULT-substitution where it belongs.

- to minimize runtime impact of checking whether
an index references the IDENTITY column and skipping it
in the first step in ExecInsertIndexTuples(), I introduced
a new attribute in the pg_index catalog.

This is likewise unreasonably complex and fragile ... but it
goes away anyway if you remove the above, no?

The patch appears to believe that OVERRIDING SYSTEM VALUE should be
restricted to the table owner, but I don't actually see any support
for that in the SQL2003 spec ... where did you get that from?

I'm pretty dubious about the kluges in aclchk.c to automatically
grant/revoke on dependent sequences --- particularly the "revoke"
part. The problem with that is that it breaks if the same sequence
is being used to feed multiple tables.

User-facing errors need to be ereport() not elog() so that they
can be translated and have appropriate SQLSTATEs reported.
elog is only for internal errors.

One other thought is that the field names based on force_default
seemed confusing. I'd suggest that maybe "generated" would be
a better name choice.

Please fix and resubmit.

regards, tom lane

#9Boszormenyi Zoltan
zb@cybertec.at
In reply to: Tom Lane (#8)
Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Tom Lane �rta:

Zoltan Boszormenyi <zboszor@dunaweb.hu> writes:

[ IDENTITY/GENERATED patch ]

I got around to reviewing this today.

Thanks for the review.
Sorry for the bit late reply, I was ill and
then occupied with some other work.

- unique index checks are done in two steps
to avoid inflating the sequence if a unique index check
is failed that doesn't reference the IDENTITY column

This is just not acceptable --- there is nothing in the standard that
requires such behavior,

But also there is nothing that would say not to do it. :-)
And this way, there is be nothing that would separate
IDENTITY from regular SERIALs only the slightly
later value generation. The behaviour I proposed
would be a big usability plus over the standard
with less possible skipped values.

and I dislike the wide-ranging kluges you
introduced to support it.

Can you see any other way to avoid skipping sequence values
as much as possible?

Please get rid of that and put the behavior
back into ordinary DEFAULT-substitution where it belongs.

You mean put the IDENTITY generation into rewriteTargetList()?
And what about the "action at a distance" behaviour
you praised so much before? (Which made the less-skipping
behaviour implementable...) Anyway, I put it back.

But it brought the consequence that GENERATED fields
may reference IDENTITY columns, too, so I removed
this limitation as well.

- to minimize runtime impact of checking whether
an index references the IDENTITY column and skipping it
in the first step in ExecInsertIndexTuples(), I introduced
a new attribute in the pg_index catalog.

This is likewise unreasonably complex and fragile ... but it
goes away anyway if you remove the above, no?

Yes.

The patch appears to believe that OVERRIDING SYSTEM VALUE should be
restricted to the table owner, but I don't actually see any support
for that in the SQL2003 spec ... where did you get that from?

Somehow it felt wrong to allow everybody to use it.
Limit removed.

I'm pretty dubious about the kluges in aclchk.c to automatically
grant/revoke on dependent sequences --- particularly the "revoke"
part. The problem with that is that it breaks if the same sequence
is being used to feed multiple tables.

OK, removed.

User-facing errors need to be ereport() not elog() so that they
can be translated and have appropriate SQLSTATEs reported.
elog is only for internal errors.

OK, changed.

One other thought is that the field names based on force_default
seemed confusing. I'd suggest that maybe "generated" would be
a better name choice.

I modified the names. force_default -> is_identity, attforceddef ->
attgenerated

I also fixed COPY without OVERRIDING SYSTEM VALUE
regarding IDENTITY and GENERATED fields and modified
the docs and the testcase according to your requested modifications.

Please fix and resubmit.
regards, tom lane

Thanks again for the review.
Here's the new version with the modifications you requested.

--
----------------------------------
Zolt�n B�sz�rm�nyi
Cybertec Geschwinde & Sch�nig GmbH
http://www.postgresql.at/

Attachments:

psql-serial-38.diff.gzapplication/x-tar; name=psql-serial-38.diff.gzDownload
#10Andrew Dunstan
andrew@dunslane.net
In reply to: Boszormenyi Zoltan (#9)
Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Zoltan Boszormenyi wrote:

Tom Lane �rta:

- unique index checks are done in two steps
to avoid inflating the sequence if a unique index check
is failed that doesn't reference the IDENTITY column

This is just not acceptable --- there is nothing in the standard that
requires such behavior,

But also there is nothing that would say not to do it. :-)
And this way, there is be nothing that would separate
IDENTITY from regular SERIALs only the slightly
later value generation. The behaviour I proposed
would be a big usability plus over the standard
with less possible skipped values.

and I dislike the wide-ranging kluges you
introduced to support it.

Can you see any other way to avoid skipping sequence values
as much as possible?

This doesn't strike me as a sensible design goal. Why not just live with
skipped values?

cheers

andrew

#11Boszormenyi Zoltan
zb@cybertec.at
In reply to: Andrew Dunstan (#10)
Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Andrew Dunstan �rta:

Zoltan Boszormenyi wrote:

Tom Lane �rta:

- unique index checks are done in two steps
to avoid inflating the sequence if a unique index check
is failed that doesn't reference the IDENTITY column

This is just not acceptable --- there is nothing in the standard that
requires such behavior,

But also there is nothing that would say not to do it. :-)
And this way, there is be nothing that would separate
IDENTITY from regular SERIALs only the slightly
later value generation. The behaviour I proposed
would be a big usability plus over the standard
with less possible skipped values.

and I dislike the wide-ranging kluges you
introduced to support it.

Can you see any other way to avoid skipping sequence values
as much as possible?

This doesn't strike me as a sensible design goal. Why not just live
with skipped values?

cheers

andrew

The idea wouldn't have considered to cross my mind
if Tom didn't mention the action-at-a-distance behaviour.
If you look back in the archives, my first working
IDENTITY was nothing more than the syntactic sugar
over the regular SERIAL.

--
----------------------------------
Zolt�n B�sz�rm�nyi
Cybertec Geschwinde & Sch�nig GmbH
http://www.postgresql.at/

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Boszormenyi Zoltan (#11)
Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Zoltan Boszormenyi <zb@cybertec.at> writes:

The idea wouldn't have considered to cross my mind
if Tom didn't mention the action-at-a-distance behaviour.

AFAIR, that remark had to do with NEXT VALUE FOR, which SQL2003 defines
in a very weird way (it's not equivalent to nextval() as one would wish).
I don't see anything strange in the spec for GENERATED.

regards, tom lane

#13Boszormenyi Zoltan
zb@cybertec.at
In reply to: Tom Lane (#12)
Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Tom Lane �rta:

Zoltan Boszormenyi <zb@cybertec.at> writes:

The idea wouldn't have considered to cross my mind
if Tom didn't mention the action-at-a-distance behaviour.

AFAIR, that remark had to do with NEXT VALUE FOR, which SQL2003 defines
in a very weird way (it's not equivalent to nextval() as one would wish).
I don't see anything strange in the spec for GENERATED.

regards, tom lane

Thanks for clarifying.
Please review the version I sent you.

--
----------------------------------
Zolt�n B�sz�rm�nyi
Cybertec Geschwinde & Sch�nig GmbH
http://www.postgresql.at/

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Boszormenyi Zoltan (#9)
Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Zoltan Boszormenyi <zb@cybertec.at> writes:

Here's the new version with the modifications you requested.

I see another problem with this patch: the code added to
ATExecDropColumn is a crude hack. It doesn't work anyway since this is
not the only possible way for columns to be dropped (another one that
comes to mind immediately is DROP TYPE ... CASCADE). The only correct
way to handle things is to let the dependency mechanism do it. I think
you would get the behavior you want if you make the generated columns
have AUTO rather than NORMAL dependencies on the columns they reference.

regards, tom lane

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#14)
Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

I wrote:

I see another problem with this patch: the code added to
ATExecDropColumn is a crude hack. It doesn't work anyway since this is
not the only possible way for columns to be dropped (another one that
comes to mind immediately is DROP TYPE ... CASCADE). The only correct
way to handle things is to let the dependency mechanism do it.

Actually, the whole question of dependencies for generated columns
probably needs some thought. What should happen if a function or
operator used in a GENERATED expression gets dropped? The result
for a normal column's default expression is that the default expression
goes away, but the column is still there. I suspect we don't want
that for a GENERATED column --- it would then be effectively a plain
column.

Along the same lines, is ALTER COLUMN DROP DEFAULT a legal operation
on a generated column? What about just replacing the expression with
ALTER COLUMN SET DEFAULT?

Before you get too excited about making generated columns disappear
automatically in all these cases, consider that dropping a column
is not something to be done lightly --- it might contain irreplaceable
data.

On second thought maybe the right approach is just to allow the default
expression to be dropped the same as it would be for an ordinary column,
and to make sure that if a GENERATED column doesn't (currently) have a
default, it is treated the same as an ordinary column.

This leads to the further thought that maybe GENERATED is not a property
of a column at all, but of its default (ie, it should be stored in
pg_attrdef not pg_attribute, which would certainly make the patch less
messy). AFAICS plain GENERATED merely indicates that the default
expression can depend on other columns, which is certainly a property
of the default --- you could imagine ALTER COLUMN SET DEFAULT GENERATED
AS ... to make a formerly plain column into a GENERATED one. I'm not
entirely sure about ALWAYS though.

regards, tom lane

#16Boszormenyi Zoltan
zb@cybertec.at
In reply to: Tom Lane (#15)
Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Tom Lane �rta:

I wrote:

I see another problem with this patch: the code added to
ATExecDropColumn is a crude hack. It doesn't work anyway since this is
not the only possible way for columns to be dropped (another one that
comes to mind immediately is DROP TYPE ... CASCADE). The only correct
way to handle things is to let the dependency mechanism do it.

I will try that.

Actually, the whole question of dependencies for generated columns
probably needs some thought. What should happen if a function or
operator used in a GENERATED expression gets dropped? The result
for a normal column's default expression is that the default expression
goes away, but the column is still there. I suspect we don't want
that for a GENERATED column --- it would then be effectively a plain
column.

No, I would want the DROP FUNCTION to be cancelled if used in
a GENERATED, but a DROP ... CASCADE would drop it, too.
So, DEPENDENCY_NORMAL will keep the referencing object
but DEPENDENCY_AUTO would drop it too if the referenced
object is dropped?

Along the same lines, is ALTER COLUMN DROP DEFAULT a legal operation
on a generated column? What about just replacing the expression with
ALTER COLUMN SET DEFAULT?

Neither of these options are legal for GENERATED columns,
AFAIK I prohibited them already.

Before you get too excited about making generated columns disappear
automatically in all these cases, consider that dropping a column
is not something to be done lightly --- it might contain irreplaceable
data.

The standard says that the GENERATED column should be
dropped silently if either of the referenced columns is dropped.
I haven't seen anything about the expression, though.

On second thought maybe the right approach is just to allow the default
expression to be dropped the same as it would be for an ordinary column,
and to make sure that if a GENERATED column doesn't (currently) have a
default, it is treated the same as an ordinary column.

This leads to the further thought that maybe GENERATED is not a property
of a column at all, but of its default (ie, it should be stored in
pg_attrdef not pg_attribute, which would certainly make the patch less
messy). AFAICS plain GENERATED merely indicates that the default
expression can depend on other columns, which is certainly a property
of the default --- you could imagine ALTER COLUMN SET DEFAULT GENERATED
AS ... to make a formerly plain column into a GENERATED one. I'm not
entirely sure about ALWAYS though.

The standard says somewhere that GENERATED columns
can only be added to and dropped from a table.

My observation is: I deleted my hack from ATExecDropColumn()
and now I cannot drop the referenced column without CASCADE.
The comment in StoreAttrDefault() says the objects in the expression
will have dependencies registered. I guess "objects" also means functions?
This way, if I do explicit recordDependencyOn(, DEPENDENCY_AUTO)
on referenced columns then the standard requirement will
be satisfied, i.e. dropping columns will drop GENERATED columns
silently that reference the said column but . Am I right? Or how about using
recordDependencyOnSingleRelExpr(... , DEP_NORMAL, DEP_AUTO ) ?

regards, tom lane

--
----------------------------------
Zolt�n B�sz�rm�nyi
Cybertec Geschwinde & Sch�nig GmbH
http://www.postgresql.at/

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Boszormenyi Zoltan (#16)
Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Zoltan Boszormenyi <zb@cybertec.at> writes:

Before you get too excited about making generated columns disappear
automatically in all these cases, consider that dropping a column
is not something to be done lightly --- it might contain irreplaceable
data.

The standard says that the GENERATED column should be
dropped silently if either of the referenced columns is dropped.

[ itch... ] I think a pretty good case could be made for ignoring that
provision, on the grounds that it's a foot-gun, and that it's not very
important to follow the standard slavishly on this point because it's
hard to conceive of any application actually relying on that behavior.

You could probably implement the auto-drop behavior with some combination
of (a) AUTO rather than NORMAL dependencies from the default expression
to the stuff it depends on and (b) INTERNAL rather than AUTO dependency
from the default expression to its column. But I really question
whether this is a good idea.

The standard says somewhere that GENERATED columns
can only be added to and dropped from a table.

This argument carries no weight at all --- there is plenty of stuff in
PG that is an extension of the capabilities listed in the spec.

regards, tom lane

#18Boszormenyi Zoltan
zb@cybertec.at
In reply to: Tom Lane (#17)
Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Tom Lane �rta:

Zoltan Boszormenyi <zb@cybertec.at> writes:

Before you get too excited about making generated columns disappear
automatically in all these cases, consider that dropping a column
is not something to be done lightly --- it might contain irreplaceable
data.

The standard says that the GENERATED column should be
dropped silently if either of the referenced columns is dropped.

[ itch... ] I think a pretty good case could be made for ignoring that
provision, on the grounds that it's a foot-gun, and that it's not very
important to follow the standard slavishly on this point because it's
hard to conceive of any application actually relying on that behavior.

You could probably implement the auto-drop behavior with some combination
of (a) AUTO rather than NORMAL dependencies from the default expression
to the stuff it depends on and (b) INTERNAL rather than AUTO dependency
from the default expression to its column. But I really question
whether this is a good idea.

So, all dependency should be NORMAL to require
manual CASCADE to avoid accidental data loss.

I have two questions about the dependency system.

1. Is there a built-in defense to avoid circular dependencies?

2. If I register dependencies between column, is there a way
to retrieve all table/column type dependencies for a depender column?

What I would like to achieve is to lift the limit that
a GENERATED column cannot reference another one.
Only self-referencing should be disallowed.

The standard says somewhere that GENERATED columns
can only be added to and dropped from a table.

This argument carries no weight at all --- there is plenty of stuff in
PG that is an extension of the capabilities listed in the spec.

Point taken. So, just like with SET / DROP IDENTITY,
I should implement SET GENERATED ALWAYS
and DROP GENERATED.

regards, tom lane

--
----------------------------------
Zolt�n B�sz�rm�nyi
Cybertec Geschwinde & Sch�nig GmbH
http://www.postgresql.at/

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Boszormenyi Zoltan (#18)
Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Zoltan Boszormenyi <zb@cybertec.at> writes:

I have two questions about the dependency system.

1. Is there a built-in defense to avoid circular dependencies?

It doesn't have a problem with them, if that's what you mean.

2. If I register dependencies between column, is there a way
to retrieve all table/column type dependencies for a depender column?

You can scan pg_depend.

What I would like to achieve is to lift the limit that
a GENERATED column cannot reference another one.

I would counsel not doing that, mainly because then you will have to
solve an evaluation-order problem at runtime.

Point taken. So, just like with SET / DROP IDENTITY,
I should implement SET GENERATED ALWAYS
and DROP GENERATED.

If you think of it as a property of the default expression, then DROP
DEFAULT covers both cases, you don't need DROP GENERATED...

regards, tom lane

#20Boszormenyi Zoltan
zb@cybertec.at
In reply to: Tom Lane (#19)
Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Tom Lane �rta:

Zoltan Boszormenyi <zb@cybertec.at> writes:

I have two questions about the dependency system.

1. Is there a built-in defense to avoid circular dependencies?

It doesn't have a problem with them, if that's what you mean.

2. If I register dependencies between column, is there a way
to retrieve all table/column type dependencies for a depender column?

You can scan pg_depend.

What I would like to achieve is to lift the limit that
a GENERATED column cannot reference another one.

I would counsel not doing that, mainly because then you will have to
solve an evaluation-order problem at runtime.

OK.

Point taken. So, just like with SET / DROP IDENTITY,
I should implement SET GENERATED ALWAYS
and DROP GENERATED.

If you think of it as a property of the default expression, then DROP
DEFAULT covers both cases, you don't need DROP GENERATED...

So, I should allow DROP DEFAULT, implement
SET DEFAULT GENERATED ALWAYS AS
and modify the catalog so the GENERATED property
is part of pg_attrdef. What about IDENTITY?
Should it also be part of pg_attrdef? There are two ways
to implement it: have or don't have a notion of it.
The latter would treat GENERATED BY DEFAULT AS IDENTITY
the same as SERIAL.

regards, tom lane

--
----------------------------------
Zolt�n B�sz�rm�nyi
Cybertec Geschwinde & Sch�nig GmbH
http://www.postgresql.at/

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Boszormenyi Zoltan (#20)
#22Boszormenyi Zoltan
zb@cybertec.at
In reply to: Tom Lane (#21)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Boszormenyi Zoltan (#22)
#24Boszormenyi Zoltan
zb@cybertec.at
In reply to: Tom Lane (#23)
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Boszormenyi Zoltan (#24)
#26Boszormenyi Zoltan
zb@cybertec.at
In reply to: Tom Lane (#21)
#27Boszormenyi Zoltan
zb@cybertec.at
In reply to: Boszormenyi Zoltan (#26)
#28Boszormenyi Zoltan
zb@cybertec.at
In reply to: Boszormenyi Zoltan (#27)
#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Boszormenyi Zoltan (#28)
#30Boszormenyi Zoltan
zb@cybertec.at
In reply to: Tom Lane (#29)
#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Boszormenyi Zoltan (#30)
#32Florian Pflug
fgp@phlo.org
In reply to: Boszormenyi Zoltan (#30)
#33Andrew Dunstan
andrew@dunslane.net
In reply to: Florian Pflug (#32)
#34Boszormenyi Zoltan
zb@cybertec.at
In reply to: Andrew Dunstan (#33)
#35Boszormenyi Zoltan
zb@cybertec.at
In reply to: Boszormenyi Zoltan (#34)
#36Andrew Dunstan
andrew@dunslane.net
In reply to: Boszormenyi Zoltan (#35)
#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#36)
#38Boszormenyi Zoltan
zb@cybertec.at
In reply to: Tom Lane (#37)
#39Boszormenyi Zoltan
zb@cybertec.at
In reply to: Boszormenyi Zoltan (#38)
#40Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#37)
#41Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andrew Dunstan (#40)
#42Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#41)
#43Boszormenyi Zoltan
zb@cybertec.at
In reply to: Tom Lane (#42)
#44Martijn van Oosterhout
kleptog@svana.org
In reply to: Boszormenyi Zoltan (#43)
#45Boszormenyi Zoltan
zb@cybertec.at
In reply to: Martijn van Oosterhout (#44)
#46Andrew Dunstan
andrew@dunslane.net
In reply to: Boszormenyi Zoltan (#45)
#47Boszormenyi Zoltan
zb@cybertec.at
In reply to: Boszormenyi Zoltan (#45)
#48Boszormenyi Zoltan
zb@cybertec.at
In reply to: Andrew Dunstan (#46)
#49Andrew Dunstan
andrew@dunslane.net
In reply to: Boszormenyi Zoltan (#47)
#50Boszormenyi Zoltan
zb@cybertec.at
In reply to: Andrew Dunstan (#49)
#51Tom Lane
tgl@sss.pgh.pa.us
In reply to: Boszormenyi Zoltan (#50)
#52Boszormenyi Zoltan
zb@cybertec.at
In reply to: Tom Lane (#51)
#53Tom Lane
tgl@sss.pgh.pa.us
In reply to: Boszormenyi Zoltan (#52)
#54Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#53)
#55Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#54)
#56Boszormenyi Zoltan
zb@cybertec.at
In reply to: Tom Lane (#55)
#57Boszormenyi Zoltan
zb@cybertec.at
In reply to: Boszormenyi Zoltan (#56)
#58Boszormenyi Zoltan
zb@cybertec.at
In reply to: Boszormenyi Zoltan (#57)
#59Boszormenyi Zoltan
zb@cybertec.at
In reply to: Boszormenyi Zoltan (#58)
#60Bruce Momjian
bruce@momjian.us
In reply to: Boszormenyi Zoltan (#59)
#61Boszormenyi Zoltan
zb@cybertec.at
In reply to: Bruce Momjian (#60)
#62Bruce Momjian
bruce@momjian.us
In reply to: Boszormenyi Zoltan (#61)