Extended SERIAL parsing

Started by Zoltan Boszormenyialmost 20 years ago10 messageshackers

zboszor@dunaweb.hu

almost 20 years ago

Hi,

after some experimentation, I came up with the attached patch,
which implements parsing the following SERIAL types:

SERIAL
SERIAL GENERATED { ALWAYS | BY DEFAULT }
SERIAL GENERATED [ ALWAYS | BY DEFAULT ] AS IDENTITY( sequence options )

The underlying type is still int4 or int8,
so the problems you discussed aren't solved.
But at least the current semantics is kept.

It passes all regression tests, and it works, too:

# create table proba (i serial generated as identity(minvalue 5 maxvalue
10) primary key, t text);
NOTICE: CREATE TABLE will create implicit sequence "proba_i_seq" for
serial column "proba.i"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"proba_pkey" for table "proba"
CREATE TABLE
# insert into proba (t) values ('a');
INSERT 0 1
# select * from proba;
i | t
---+---
5 | a
(1 row)

For now, GENERATED { ALWAYS | BY DEFAULT }
are just fillings.

The condition (column->is_serial && column->force_default)
can help enforcing GENERATED ALWAYS at INSERT time
and can also help fixing the two TODO entries about SERIAL.

Best regards,
Zoltďż˝n Bďż˝szďż˝rmďż˝nyi

Rod Taylor

rbt@rbt.ca

almost 20 years ago

In reply to: Zoltan Boszormenyi (#1)

Re: Extended SERIAL parsing

The condition (column->is_serial && column->force_default)
can help enforcing GENERATED ALWAYS at INSERT time
and can also help fixing the two TODO entries about SERIAL.

You will need to include the insert components of the spec which allow
for overriding GENERATED ALWAYS during an INSERT and extend that to COPY
and teach pg_dump how to use them.
--

Zoltan Boszormenyi

zboszor@dunaweb.hu

almost 20 years ago

In reply to: Rod Taylor (#2)

Re: Extended SERIAL parsing

Rod Taylor ďż˝rta:

The condition (column->is_serial && column->force_default)
can help enforcing GENERATED ALWAYS at INSERT time
and can also help fixing the two TODO entries about SERIAL.

You will need to include the insert components of the spec which allow
for overriding GENERATED ALWAYS during an INSERT and extend that to COPY
and teach pg_dump how to use them.

OK, that's over my head at the moment. :-)
Maybe the wizards here pick up my patch
and complete it. (I hope.)

Best regards,
Zoltďż˝n Bďż˝szďż˝rmďż˝nyi

Tom Lane

tgl@sss.pgh.pa.us

almost 20 years ago

In reply to: Zoltan Boszormenyi (#1)

Re: Extended SERIAL parsing

Zoltan Boszormenyi <zboszor@dunaweb.hu> writes:

after some experimentation, I came up with the attached patch,
which implements parsing the following SERIAL types:

As has been pointed out before, it would be a seriously bad idea to
implement the SQL syntax for identity columns without matching the
SQL semantics for them. That would leave us behind the eight-ball
when we wanted to implement the SQL semantics. Right now we have
a useful but non-standard semantics, and a useful but non-standard
syntax, and those two should stick together.

I'm not too happy with converting SERIAL4 and SERIAL8 into reserved
words, either, as I believe this patch does.

Some other things missing are documentation and pg_dump support.

regards, tom lane

Zoltan Boszormenyi

zboszor@dunaweb.hu

almost 20 years ago

In reply to: Tom Lane (#4)

Re: Extended SERIAL parsing

Zoltan Boszormenyi <zboszor@dunaweb.hu> writes:

after some experimentation, I came up with the attached patch,
which implements parsing the following SERIAL types:

As has been pointed out before, it would be a seriously bad idea to
implement the SQL syntax for identity columns without matching the
SQL semantics for them. That would leave us behind the eight-ball
when we wanted to implement the SQL semantics. Right now we have
a useful but non-standard semantics, and a useful but non-standard
syntax, and those two should stick together.

Well, I read all sections of 5WD-02-Foundation-2003-09.pdf
where "identity" appears, here are the list of changes that will
be needed for an identity column:

- Only one identity column can appear in the column list.
I have to check for this at CREATE, TABLE, ALTER TABLE ADD COLUMN
and ALTER TABLE ALTER COLUMN.
- ALTER TABLE ALTER COLUMN ... RESTART [WITH] or SET
alter the sequence on the column.
- If colname is SERIAL GENERATED ALWAYS, then
only "UPDATE SER colname = default" may occur.

Then there's the DROP default question: should PostgreSQL
allow it or not? What I found about this in the standard is this:
definitions of the DEFAULT clause, the identity column specification
and the generation clause are mutually exclusive, see 11.4.
So, if you cannot specify a DEFAULT for an identity column,
you must not be able to drop it, although this isn't expressed
in the standard, it's just my opinion.

Is there anything else? I haven't found any.
Or I can't read between the lines, which is a skill
that isn't required for reading standards. :-)

I'm not too happy with converting SERIAL4 and SERIAL8 into reserved
words, either, as I believe this patch does.

Not really, only IDENTITY is added to the list of reserved words,
serial/serial4/serial8/bigserial are just type names:

# create table serial8 (serial8 serial8 primary key);
NOTICE: CREATE TABLE will create implicit sequence "serial8_serial8_seq"
for serial column "serial8.serial8"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"serial8_pkey" for table "serial8"
CREATE TABLE

The others (AS, GENERATED) were added to the
non-reserved keyword list.

Some other things missing are documentation and pg_dump support.

I am working on that. The documentation is easier. :-)

Also note, that I misread the generated column syntax as part of the
identity column syntax. So, the parsing should only recognize

SERIAL [ GENERATED { ALWAYS | BY DEFAULT }
AS IDENTITY [ ( sequence_options ) ] ]

which I already fixed here and the sequence_options list
cannot be empty as with my previous attempt.

Best regards,
Zoltďż˝n Bďż˝szďż˝rmďż˝nyi

Jim Nasby

Jim.Nasby@BlueTreble.com

almost 20 years ago

In reply to: Zoltan Boszormenyi (#5)

Re: Extended SERIAL parsing

On Mon, Jun 12, 2006 at 02:27:31PM +0200, B?sz?rm?nyi Zolt?n wrote:

Zoltan Boszormenyi <zboszor@dunaweb.hu> writes:

after some experimentation, I came up with the attached patch,
which implements parsing the following SERIAL types:

As has been pointed out before, it would be a seriously bad idea to
implement the SQL syntax for identity columns without matching the
SQL semantics for them. That would leave us behind the eight-ball
when we wanted to implement the SQL semantics. Right now we have
a useful but non-standard semantics, and a useful but non-standard
syntax, and those two should stick together.

Well, I read all sections of 5WD-02-Foundation-2003-09.pdf
where "identity" appears, here are the list of changes that will
be needed for an identity column:

Have you read the archives on the recent discussions that have taken
place about whether SERIAL should be a black box or not? IIRC most of
this was all hashed out in that thread.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

Tom Lane

tgl@sss.pgh.pa.us

almost 20 years ago

In reply to: Zoltan Boszormenyi (#5)

Re: Extended SERIAL parsing

=?iso-8859-2?Q?B=F6sz=F6rm=E9nyi_Zolt=E1n?= <zboszor@dunaweb.hu> writes:

Well, I read all sections of 5WD-02-Foundation-2003-09.pdf
where "identity" appears, here are the list of changes that will
be needed for an identity column:

You're missing the hard part: NEXT VALUE FOR is sufficiently different
from nextval() that it will be very painful to implement. Until we have
a way of doing that, I think it would be unwise to use the SQL syntax
for things that don't behave the way the spec says. We might find that
spec-compliant sequences need to be a completely different object type,
or something equally evil. Right now, we have the freedom to do that
if that's what it takes. With the spec syntax already locked down as
generating PG-style sequences, we'd be hosed.

I'm not too happy with converting SERIAL4 and SERIAL8 into reserved
words, either, as I believe this patch does.

Not really, only IDENTITY is added to the list of reserved words,
serial/serial4/serial8/bigserial are just type names:

You apparently haven't thought very hard about the consequences of what
you did to keywords.c. But I'll give you a hint: mapping distinct
strings to the same token is a bad idea.

regards, tom lane

Zoltan Boszormenyi

zboszor@dunaweb.hu

almost 20 years ago

In reply to: Jim Nasby (#6)

Re: Extended SERIAL parsing

Hi,

Jim C. Nasby ï¿½rta:

On Mon, Jun 12, 2006 at 02:27:31PM +0200, B?sz?rm?nyi Zolt?n wrote:

Zoltan Boszormenyi <zboszor@dunaweb.hu> writes:

after some experimentation, I came up with the attached patch,
which implements parsing the following SERIAL types:

As has been pointed out before, it would be a seriously bad idea to
implement the SQL syntax for identity columns without matching the
SQL semantics for them. That would leave us behind the eight-ball
when we wanted to implement the SQL semantics. Right now we have
a useful but non-standard semantics, and a useful but non-standard
syntax, and those two should stick together.

Well, I read all sections of 5WD-02-Foundation-2003-09.pdf
where "identity" appears, here are the list of changes that will
be needed for an identity column:

Have you read the archives on the recent discussions that have taken
place about whether SERIAL should be a black box or not? IIRC most of
this was all hashed out in that thread.

I just read it thoroughly, and the issues I listed wasn't mentioned
in the "black box" thread, at all. I am trying to implement the
standard syntax ( and gradually the conformant behaviour )
along the lines of sections 4.14.7, 11.3, 11.4, 11.7, 11.11,
11.12, 11.17 and 14.8.

Best regards,
Zoltï¿½n Bï¿½szï¿½rmï¿½nyi

Zoltan Boszormenyi

zboszor@dunaweb.hu

almost 20 years ago

In reply to: Tom Lane (#7)

Re: Extended SERIAL parsing

Tom Lane ďż˝rta:

=?iso-8859-2?Q?B=F6sz=F6rm=E9nyi_Zolt=E1n?= <zboszor@dunaweb.hu> writes:

Well, I read all sections of 5WD-02-Foundation-2003-09.pdf
where "identity" appears, here are the list of changes that will
be needed for an identity column:

You're missing the hard part: NEXT VALUE FOR is sufficiently different
from nextval() that it will be very painful to implement. Until we have
a way of doing that, I think it would be unwise to use the SQL syntax
for things that don't behave the way the spec says. We might find that
spec-compliant sequences need to be a completely different object type,
or something equally evil. Right now, we have the freedom to do that
if that's what it takes. With the spec syntax already locked down as
generating PG-style sequences, we'd be hosed.

Do you mean the allowed and denied contexts of the
NEXT VALUE FOR expression in section 6.13?
(As opposed to nextval() which, as being a function
is allowed more broadly.) This part may still be described
with grammar, unless you mean something more suble.

I'm not too happy with converting SERIAL4 and SERIAL8 into reserved
words, either, as I believe this patch does.

Not really, only IDENTITY is added to the list of reserved words,
serial/serial4/serial8/bigserial are just type names:

You apparently haven't thought very hard about the consequences of what
you did to keywords.c. But I'll give you a hint: mapping distinct
strings to the same token is a bad idea.

OK, point taken.

Best regards,
Zoltďż˝n Bďż˝szďż˝rmďż˝nyi

#10

Tom Lane

tgl@sss.pgh.pa.us

almost 20 years ago

In reply to: Zoltan Boszormenyi (#9)

Re: Extended SERIAL parsing

Zoltan Boszormenyi <zboszor@dunaweb.hu> writes:

Tom Lane �rta:

You're missing the hard part: NEXT VALUE FOR is sufficiently different
from nextval() that it will be very painful to implement.

Do you mean the allowed and denied contexts of the
NEXT VALUE FOR expression in section 6.13?

No, I mean that different textual instances of NEXT VALUE FOR are
required to return the same value in certain cases. See the "once per
row" bits in the 6.13 General Rules. It's not real clear what "once per
row" actually means, especially in cases such as references from within
functions invoked by a query --- is that the same query or a different
one? You could make a case for wanting either behavior, particularly
when considering trigger functions. On the whole, it's a mess, and not
particularly well thought out IMHO. But it's in the spec, and if we are
going to adopt the spec's syntax for identity columns then we'd better
provide the spec's behavior.

(As opposed to nextval() which, as being a function
is allowed more broadly.) This part may still be described
with grammar, unless you mean something more suble.

I think trying to enforce the restrictions in 6.13 in the grammar
would be a terrible mistake; better to do it in parse analysis.
Compare the restrictions on where aggregate functions can appear;
we don't try to enforce those grammatically. One reason why not
is that you'd have a hard time getting the grammar to produce anything
more specific than "syntax error", which is pretty unhelpful.

regards, tom lane

Extended SERIAL parsing

Attachments: