Re: Final version of IDENTITY/GENERATED patch

Started by Zoltan Boszormenyi, about 19 years ago, 62 messages, hackers

Zoltan Boszormenyi

zboszor@dunaweb.hu

about 19 years ago

Resending compressed; it seems pgsql-patches
doesn't let me post such large patches.

Zoltan Boszormenyi wrote:

> Hi,
>
> I have finished my GENERATED/IDENTITY patch
> and now it does everything I wanted it to do. Changes
> from the previous version:
>
> - GENERATED columns work now
> - extended testcase to test some GENERATED expressions
> - extended documentation
>
> Now it comes in an uncompressed context diff form,
> out of habit I sent unified diffs before, sorry for that.
> It applies to current CVS.
>
> Please, review.
>
> Thanks in advance,
> Zoltán Böszörményi

Zoltan Boszormenyi

zboszor@dunaweb.hu

about 19 years ago

In reply to: Zoltan Boszormenyi (#1)

Hi,

I think now this is really the final version.

Changes in this version are:
- when dropping a column that's referenced
  by a GENERATED column, the GENERATED
  column also has to be dropped. It's required by SQL:2003.
- COPY table FROM works correctly with IDENTITY
  and GENERATED columns
- extended testcase to show the above two
To reiterate all the features that accumulated
over time, here's the list:

- extended catalog (pg_attribute) to keep track of whether
  the column is IDENTITY or GENERATED
- working GENERATED column that may reference
  other regular columns; it extends the DEFAULT
  infrastructure to allow storing complex expressions;
  syntax for such columns:
      colname type GENERATED ALWAYS AS ( expression )
- working IDENTITY column whose value is generated
  after all other columns (regular or GENERATED)
  are assigned values and validated via their
  NOT NULL and CHECK constraints; this allows
  tighter numbering - the only case where serials may
  go missing is when a UNIQUE index check fails
  (which is checked on heap_insert() and heap_update()
  and is a tougher nut to crack)
  syntax is:
      colname type GENERATED { ALWAYS | BY DEFAULT }
          AS IDENTITY [ ( sequence options ) ]
  the original SERIAL pseudo-type is left unmodified; the IDENTITY
  concept is new and extends it - PostgreSQL may have multiple
  SERIAL columns in a table, but SQL:2003 requires that at most
  one IDENTITY column may exist in a table at any time
- Implemented the following TODOs:
  - %Have ALTER TABLE RENAME rename SERIAL sequence names
  - Allow SERIAL sequences to inherit permissions from the base table?
    Actually the roles that have INSERT or UPDATE permissions
    on the table gain permission on the sequence, too.
    This makes the following TODO unneeded:
  - Add DEFAULT .. AS OWNER so permission checks are done as the table owner
    This would be useful for SERIAL nextval() calls and CHECK constraints.
- DROP DEFAULT is prohibited on GENERATED and IDENTITY columns
- One SERIAL column can be upgraded to IDENTITY via
      ALTER COLUMN column SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
  Same for downgrading, via:
      ALTER COLUMN column DROP IDENTITY
- COPY and INSERT may use the OVERRIDING SYSTEM VALUE
  clause to override automatic generation and allow
  importing dumped data unmodified
- UPDATE is forbidden for GENERATED ALWAYS AS IDENTITY
  columns entirely and for GENERATED ALWAYS AS (expr)
  columns for values other than DEFAULT.
- ALTER COLUMN SET <sequence options> for
  altering the supporting sequence; works on any
  SERIAL-like or IDENTITY columns
- ALTER COLUMN RESTART [WITH] N
  for changing only the next generated number in the
  sequence.
- The essence of pg_get_serial_sequence() is exported
  as get_relid_att_serial_sequence() to be used internally
  by checks.
- CHECK constraints cannot reference IDENTITY or
  GENERATED columns
- GENERATED columns cannot reference IDENTITY or
  GENERATED columns
- dropping a column that's referenced by a GENERATED column
  also drops the GENERATED column
- pg_dump dumps correct schema for IDENTITY and
  GENERATED columns:
  - ALTER COLUMN SET GENERATED ... AS IDENTITY
    for IDENTITY columns after ALTER SEQUENCE OWNED BY
  - correct GENERATED AS ( expression ) clause in the table schema
- pg_dump dumps COPY OVERRIDING SYSTEM VALUE
  for the data of tables that have any GENERATED or
  GENERATED ALWAYS AS IDENTITY columns.
- documentation and testcases
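
Two of the rules in the list (a GENERATED column may reference only regular columns, and dropping a referenced column also drops the GENERATED column) can be pictured with a small toy model. This is only a sketch in Python, not code from the patch: the ToySchema class and its method names are hypothetical, and IDENTITY columns are not modelled.

```python
class ToySchema:
    """Hypothetical model of a table's columns; not PostgreSQL code."""

    def __init__(self):
        self.regular = set()   # names of ordinary columns
        self.generated = {}    # GENERATED column -> regular columns it references

    def add_column(self, name):
        self.regular.add(name)

    def add_generated(self, name, references):
        refs = set(references)
        # Only regular columns may be referenced by a GENERATED expression.
        not_regular = refs - self.regular
        if not_regular:
            raise ValueError(
                f"{name} may only reference regular columns: {sorted(not_regular)}"
            )
        self.generated[name] = refs

    def drop_column(self, name):
        """Drop a column; return it followed by any dependent GENERATED columns."""
        if name in self.generated:
            del self.generated[name]
            return [name]
        self.regular.remove(name)
        dependents = sorted(g for g, refs in self.generated.items() if name in refs)
        for g in dependents:
            del self.generated[g]   # cascade to the GENERATED column
        return [name] + dependents
```

With columns price and qty and a GENERATED column total that references both, drop_column("qty") in this model removes qty and total together.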

Please, review.

Best regards,
Zoltán Böszörményi

Bruce Momjian

bruce@momjian.us

about 19 years ago

In reply to: Zoltan Boszormenyi (#2)

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Zoltan Boszormenyi

zboszor@dunaweb.hu

about 19 years ago

In reply to: Bruce Momjian (#3)

Hi!

Thanks.

However, in the meantime I made some changes so that a failed
insert advances the IDENTITY column's sequence only when the
failure is in the IDENTITY column's own CHECK constraints or
UNIQUE indexes. I still have some work to do on expression indexes.
Should I post an incremental patch against this version
or a full patch when it's ready?
An incremental patch can still be posted when the feature
is agreed to be in 8.3 and actually applied. It only changes
some details of the new feature and doesn't change the
behaviour of existing features.

Best regards,
Zoltán Böszörményi

Bruce Momjian

bruce@momjian.us

about 19 years ago

In reply to: Zoltan Boszormenyi (#4)

Zoltan Boszormenyi wrote:

> Hi!
>
> Thanks.
>
> However, in the meantime I made some changes so that a failed
> insert advances the IDENTITY column's sequence only when the
> failure is in the IDENTITY column's own CHECK constraints or
> UNIQUE indexes. I still have some work to do on expression indexes.
> Should I post an incremental patch against this version
> or a full patch when it's ready?

Full patch.

Zoltan Boszormenyi

zboszor@dunaweb.hu

about 19 years ago

In reply to: Bruce Momjian (#5)

IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Bruce Momjian wrote:

> Zoltan Boszormenyi wrote:
>
>> Hi!
>>
>> Thanks.
>>
>> However, in the meantime I made some changes so that a failed
>> insert advances the IDENTITY column's sequence only when the
>> failure is in the IDENTITY column's own CHECK constraints or
>> UNIQUE indexes. I still have some work to do on expression indexes.
>> Should I post an incremental patch against this version
>> or a full patch when it's ready?
>
> Full patch.

Then here it is. Now it's really finished, I promise. :-)
Changes:

- unique index checks are done in two steps
  to avoid inflating the sequence when a unique index check
  fails on an index that doesn't reference the IDENTITY column
- to minimize the runtime impact of checking whether
  an index references the IDENTITY column and skipping it
  in the first step in ExecInsertIndexTuples(), I introduced
  a new attribute in the pg_index catalog. I had to place it
  in the middle of the fixed-size attributes because I had
  mysterious crashes otherwise. This means the attributes
  are renumbered. This attribute is determined during
  CREATE INDEX and recomputed for all indexes defined
  on the table during ALTER TABLE SET/DROP IDENTITY.
- as a consequence, IDENTITY/GENERATED columns can now
  have CHECK constraints; this limit was removed.
- modified testcase for the above changes
- reworded documentation
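
The two-step idea can be sketched with an in-memory toy. This is a Python simulation under stated assumptions, not the patch's C code: ToyTable, UniqueViolation and last_value are hypothetical names, and a plain counter stands in for the backing sequence.

```python
import itertools


class UniqueViolation(Exception):
    """Raised when a row would duplicate a key in a unique index."""


class ToyTable:
    """Hypothetical stand-in for a table with one IDENTITY column."""

    def __init__(self, identity_col, unique_indexes):
        self.identity_col = identity_col
        # Each unique index is a tuple of column names.
        self.unique_indexes = [tuple(ix) for ix in unique_indexes]
        self.rows = []
        self._seq = itertools.count(1)  # plays the role of the sequence
        self.last_value = 0             # last value the sequence handed out

    def _nextval(self):
        self.last_value = next(self._seq)
        return self.last_value

    def _check(self, row, index):
        key = tuple(row[c] for c in index)
        if any(tuple(r[c] for c in index) == key for r in self.rows):
            raise UniqueViolation(index)

    def insert(self, **values):
        row = dict(values)
        # Step 1: check the indexes that do not involve the IDENTITY
        # column, before any sequence value has been drawn.
        for ix in self.unique_indexes:
            if self.identity_col not in ix:
                self._check(row, ix)
        # Step 2: draw the value, then check the remaining indexes.
        row[self.identity_col] = self._nextval()
        for ix in self.unique_indexes:
            if self.identity_col in ix:
                self._check(row, ix)
        self.rows.append(row)
        return row
```

In this model a duplicate in an index that does not include the IDENTITY column is rejected in step 1, so last_value is unchanged and the next successful insert gets the next consecutive number.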

Please, review.

Best regards,
Zoltán Böszörményi

Bruce Momjian

bruce@momjian.us

about 19 years ago

In reply to: Zoltan Boszormenyi (#6)

Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

Tom Lane

tgl@sss.pgh.pa.us

about 19 years ago

In reply to: Zoltan Boszormenyi (#6)

Re: IDENTITY/GENERATED v36 Re: Final version of IDENTITY/GENERATED patch

Zoltan Boszormenyi <zboszor@dunaweb.hu> writes:

> [ IDENTITY/GENERATED patch ]

I got around to reviewing this today.

> - unique index checks are done in two steps
> to avoid inflating the sequence if a unique index check
> is failed that doesn't reference the IDENTITY column

This is just not acceptable --- there is nothing in the standard that
requires such behavior, and I dislike the wide-ranging kluges you
introduced to support it. Please get rid of that and put the behavior
back into ordinary DEFAULT-substitution where it belongs.

> - to minimize runtime impact of checking whether
> an index references the IDENTITY column and skipping it
> in the first step in ExecInsertIndexTuples(), I introduced
> a new attribute in the pg_index catalog.

This is likewise unreasonably complex and fragile ... but it
goes away anyway if you remove the above, no?

The patch appears to believe that OVERRIDING SYSTEM VALUE should be
restricted to the table owner, but I don't actually see any support
for that in the SQL2003 spec ... where did you get that from?

I'm pretty dubious about the kluges in aclchk.c to automatically
grant/revoke on dependent sequences --- particularly the "revoke"
part. The problem with that is that it breaks if the same sequence
is being used to feed multiple tables.
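
The shared-sequence problem can be made concrete with a toy privilege model. This is a minimal Python sketch, assuming that granting or revoking INSERT on a table automatically grants or revokes use of the sequence feeding it. ToyCatalog and its methods are hypothetical and do not reflect the aclchk.c code in the patch.

```python
class ToyCatalog:
    """Hypothetical privilege bookkeeping; not PostgreSQL's ACL code."""

    def __init__(self):
        self.table_insert = {}  # table -> roles holding INSERT on it
        self.seq_usage = {}     # sequence -> roles allowed to draw values
        self.feeds = {}         # table -> sequence that feeds it

    def add_table(self, table, sequence):
        self.table_insert[table] = set()
        self.feeds[table] = sequence
        self.seq_usage.setdefault(sequence, set())

    def grant_insert(self, role, table):
        self.table_insert[table].add(role)
        # Automatic grant on the dependent sequence.
        self.seq_usage[self.feeds[table]].add(role)

    def revoke_insert(self, role, table):
        self.table_insert[table].discard(role)
        # Automatic revoke: ignores other tables using the same sequence.
        self.seq_usage[self.feeds[table]].discard(role)

    def can_insert(self, role, table):
        seq = self.feeds[table]
        return role in self.table_insert[table] and role in self.seq_usage[seq]
```

If two tables are fed by one sequence and a role holds INSERT on both, revoking INSERT on the first table in this model also removes the role's access to the sequence, so inserts into the second table stop working even though its INSERT privilege was never revoked.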

User-facing errors need to be ereport() not elog() so that they
can be translated and have appropriate SQLSTATEs reported.
elog is only for internal errors.

One other thought is that the field names based on force_default
seemed confusing. I'd suggest that maybe "generated" would be
a better name choice.

Please fix and resubmit.

regards, tom lane
