Sequences....
Hello hackers...
I've spent the last couple of evening tracing through the drop table/sequence
code trying to figure out the best to drop the sequence when the table is
dropped.
Here is what I am proposing to do. I just wanted to throw out my idea and get
some feedback since I am just beginning to understand how the backend works.
Take the following example:
CREATE TABLE foo (i SERIAL, t text);
This creates table foo, index foo_i_key, and the sequence foo_i_seq.
The sequence ocuppies three of the system tables: pg_class, pg_attribute, and
pg_attrdef. When the table gets dropped, the table foo and foo_i_key are
removed. The default portion of the sequence is also removed from the
pg_attrdef system table, because the attrelid matches the table's oid.
I believe this is incorrect ... I think the attrelid should match the seqences
oid instead of the table's oid to prevent the following error:
ryan=> CREATE TABLE foo (i SERIAL, t text);
NOTICE: CREATE TABLE will create implicit sequence foo_i_seq for SERIAL column
foo.i
NOTICE: CREATE TABLE/UNIQUE will create implicit index foo_i_key for table foo
CREATE
ryan=> \d
Database = ryan
+------------------+----------------------------------+----------+
| Owner | Relation | Type |
+------------------+----------------------------------+----------+
| rbrad | foo | table |
| rbrad | foo_i_key | index |
| rbrad | foo_i_seq | sequence |
+------------------+----------------------------------+----------+
ryan=> \d foo;
Table = foo
+----------------------------------+----------------------------------+-------+
| Field | Type | Length|
+----------------------------------+----------------------------------+-------+
| i | int4 not null default nextval('f | 4 |
| t | text | var |
+----------------------------------+----------------------------------+-------+
Index: foo_i_key
ryan=> drop sequence foo_i_seq;
DROP
ryan=> \d
Database = ryan
+------------------+----------------------------------+----------+
| Owner | Relation | Type |
+------------------+----------------------------------+----------+
| rbrad | foo | table |
| rbrad | foo_i_key | index |
+------------------+----------------------------------+----------+
ryan=> \d foo;
Table = foo
+----------------------------------+----------------------------------+-------+
| Field | Type | Length|
+----------------------------------+----------------------------------+-------+
| i | int4 not null default nextval('f | 4 |
| t | text | var |
+----------------------------------+----------------------------------+-------+
Index: foo_i_key
ryan=> insert into foo (t) values ('blah');
ERROR: foo_i_seq.nextval: sequence does not exist
ryan=>
This looks pretty easy to fix.
Back to my origional point .. I think we need another system table to map the
sequence oid to the table's oid. I've noticed this done with the inheritance,
indexes, etc ... but I don't see a pg_sequence table.
I would be glad to try and finish this in the next couple of evenings if this
looks like the correct approach to the problem, otherwise could someone point me
in the right direction :)
Thanks,
-Ryan
Should I try to finish this before 6.5? Or wait for the next release?
-Ryan
Hello hackers...
I've spent the last couple of evening tracing through the drop table/sequence
code trying to figure out the best to drop the sequence when the table is
dropped.Here is what I am proposing to do. I just wanted to throw out my idea and get
some feedback since I am just beginning to understand how the backend works.Take the following example:
CREATE TABLE foo (i SERIAL, t text);This creates table foo, index foo_i_key, and the sequence foo_i_seq.
The sequence ocuppies three of the system tables: pg_class, pg_attribute, and
pg_attrdef. When the table gets dropped, the table foo and foo_i_key are
removed. The default portion of the sequence is also removed from the
pg_attrdef system table, because the attrelid matches the table's oid.I believe this is incorrect ... I think the attrelid should match the seqences
oid instead of the table's oid to prevent the following error:ryan=> CREATE TABLE foo (i SERIAL, t text);
NOTICE: CREATE TABLE will create implicit sequence foo_i_seq for SERIAL
column
foo.i
NOTICE: CREATE TABLE/UNIQUE will create implicit index foo_i_key for table
foo
CREATE
ryan=> \d
Database = ryan +------------------+----------------------------------+----------+ | Owner | Relation | Type | +------------------+----------------------------------+----------+ | rbrad | foo | table | | rbrad | foo_i_key | index | | rbrad | foo_i_seq | sequence | +------------------+----------------------------------+----------+ryan=> \d foo;
Table = foo
+----------------------------------+----------------------------------+-------+
| Field | Type |
Length|
+----------------------------------+----------------------------------+-------+
| i | int4 not null default nextval('f | 4
|
| t | text | var
|
+----------------------------------+----------------------------------+-------+
Index: foo_i_keyryan=> drop sequence foo_i_seq;
DROPryan=> \d
Database = ryan +------------------+----------------------------------+----------+ | Owner | Relation | Type | +------------------+----------------------------------+----------+ | rbrad | foo | table | | rbrad | foo_i_key | index | +------------------+----------------------------------+----------+ ryan=> \d foo;Table = foo
+----------------------------------+----------------------------------+-------+
| Field | Type |
Length|
+----------------------------------+----------------------------------+-------+
| i | int4 not null default nextval('f | 4
|
| t | text | var
|
+----------------------------------+----------------------------------+-------+
Index: foo_i_keyryan=> insert into foo (t) values ('blah');
ERROR: foo_i_seq.nextval: sequence does not existryan=>
This looks pretty easy to fix.
Back to my origional point .. I think we need another system table to map the
sequence oid to the table's oid. I've noticed this done with the inheritance,
indexes, etc ... but I don't see a pg_sequence table.I would be glad to try and finish this in the next couple of evenings if this
looks like the correct approach to the problem, otherwise could someone point
me
Show quoted text
in the right direction :)
Thanks,
-Ryan
Import Notes
Resolved by subject fallback
Ryan Bradetich wrote:
Back to my origional point .. I think we need another system table to map the
sequence oid to the table's oid. I've noticed this done with the inheritance,
indexes, etc ... but I don't see a pg_sequence table.
Sounds good.
As long as a sequence can point to more than
one table/column combination.
Or, I guess, you can have the relationship 1-1 for
the SERIAL type, but this should not prevent using
sequences across more than one table if you don't use SERIAL.
I often use a sequence for 3 or more tables in a system
so that I can use 'generic' functions on the tables
and produce reports without conficting primary keys..
:) Clark
Ryan Bradetich wrote:
Back to my origional point .. I think we need another system table to map
the
sequence oid to the table's oid. I've noticed this done with the
inheritance,
indexes, etc ... but I don't see a pg_sequence table.
Sounds good.
As long as a sequence can point to more than
one table/column combination.Or, I guess, you can have the relationship 1-1 for
the SERIAL type, but this should not prevent using
sequences across more than one table if you don't use SERIAL.
I often use a sequence for 3 or more tables in a system
so that I can use 'generic' functions on the tables
and produce reports without conficting primary keys..:) Clark
Hmm.. Good points.... I'll make sure that doesn't happen. Thanks for the tips.
:)
-Ryan
Import Notes
Resolved by subject fallback
As long as a sequence can point to more than
one table/column combination.
Doesn't seem like a problem to me --- as far as I understood Ryan,
the new table he's proposing would only contain entries for sequences
created to implement SERIAL keywords. For those, I think there should
indeed be a 1-1 mapping between parent table (+column) and resulting
sequence.
But yeah, don't break usage of ordinary standalone sequences ;-).
Another thing to think about is what it's going to take to dump and
reload this structure in pg_dump. We need to be able to reconstitute
the system tables' contents and the current value of the SERIAL sequence
after a reload.
regards, tom lane
Import Notes
Reply to msg id not found: YourmessageofMon15Mar1999180618-0700199903160106.SAA10356@hpb50023.boi.hp.com | Resolved by subject fallback
BTW, has anyone thought twice about the interaction of SERIAL columns
with inheritance? If I create a table having a SERIAL column and then
create a child table that inherits from the first, what happens? Does
the child share the use of the parent's sequence (implying that serial
number assignments are unique across the parent and all its children)?
Or does the child get a new sequence object of its very own --- and if
so, what does that sequence object start out at?
We ought to find out what the current code actually does and then think
about whether we like it or not; I'll bet that the current behavior was
not designed but just fell out of the implementation.
If we do want shared use of a parent's sequence, that's going to
complicate Ryan's new system table considerably --- probably it needs
to have a row for each table using a particular sequence-created-to-
implement-SERIAL, and the sequence object can be deleted only when the
last reference to it goes away. Life may become even more interesting
for pg_dump, too.
regards, tom lane
Import Notes
Reply to msg id not found: YourmessageofMon15Mar1999222328-050014901.921554608@sss.pgh.pa.us | Resolved by subject fallback
Ryan Bradetich wrote:
Take the following example:
CREATE TABLE foo (i SERIAL, t text);This creates table foo, index foo_i_key, and the sequence foo_i_seq.
The sequence ocuppies three of the system tables: pg_class, pg_attribute, and
pg_attrdef. When the table gets dropped, the table foo and foo_i_key are
removed. The default portion of the sequence is also removed from the
pg_attrdef system table, because the attrelid matches the table's oid.I believe this is incorrect ... I think the attrelid should match the seqences
oid instead of the table's oid to prevent the following error:
pg_attrdef->attrelid is used to store DEFAULT definition
for particular attribute -> DEFAULT part of SERIAL definition
will not work after this...
Back to my origional point .. I think we need another system table to map the
sequence oid to the table's oid. I've noticed this done with the inheritance,
indexes, etc ... but I don't see a pg_sequence table.
Sequences and tables are independent things - that's why
there was no pg_sequence table. Currently each sequence
has row in pg_class, i.e. sequences are special tables.
But I agreed that we need in new table to reflect
SERIAL <--> sequence dependencies.
Vadim
BTW, has anyone thought twice about the interaction of SERIAL columns
with inheritance? If I create a table having a SERIAL column and then
create a child table that inherits from the first, what happens? Does
the child share the use of the parent's sequence (implying that serial
number assignments are unique across the parent and all its children)?
Or does the child get a new sequence object of its very own --- and if
so, what does that sequence object start out at?We ought to find out what the current code actually does and then think
about whether we like it or not; I'll bet that the current behavior was
not designed but just fell out of the implementation.
Currently the parent and child share the same sequence.
ryan=> CREATE TABLE parent (i SERIAL);
NOTICE: CREATE TABLE will create implicit sequence parent_i_seq for SERIAL column parent.i
NOTICE: CREATE TABLE/UNIQUE will create implicit index parent_i_key for table parent
CREATE
ryan=> CREATE TABLE child (t text) INHERITS (parent);
CREATE
ryan=> INSERT INTO parent VALUES (NEXTVAL('parent_i_seq'));
INSERT 18731 1
ryan=> INSERT INTO child (t) values ('test');
INSERT 18732 1
ryan=> INSERT INTO parent VALUES (NEXTVAL('parent_i_seq'));
INSERT 18733 1
ryan=> SELECT * FROM parent;
i
-
1
3
(2 rows)
ryan=> SELECT * FROM child;
i|t
-+----
2|test
(1 row)
ryan=>
If we do want shared use of a parent's sequence, that's going to
complicate Ryan's new system table considerably --- probably it needs
to have a row for each table using a particular sequence-created-to-
implement-SERIAL, and the sequence object can be deleted only when the
last reference to it goes away. Life may become even more interesting
for pg_dump, too.regards, tom lane
I'm not sure about the pg_dump, but I do not see the need to added complexity to the system table because of the shared sequence. The
parent table can not be dropped while the child table exists, which would be the last reference to the serial-sequence.
ryan=> drop table parent;
ERROR: Relation '18718' inherits 'parent'
ryan=>
This is the behavior I would expect, but then again I'm pretty new to databases ... and know nothing about the standard :)
-Ryan
P.S. I hope to finish the patch tonight for the system table, but I will probably need some help/input on the pg_dump issues.
Import Notes
Resolved by subject fallback
I've finished that patch for the serial->sequences but I am not happy with it.
I don't think belive my proposed idea is the correct approach. I still want to work on this
idea, but I'm not sure how to proceed.
Here are my thoughts:
1. If I use my proposed idea, everything works great until you dump/reload the table. The dump
doesn't store the data type as serial, but as int with a default and a sequence. So when the
table gets reloaded, the relationship nolonger exists. (I think I finally understand the dump
issues now Tom :))
2. Tom suggested reference counting the seqence in one of his email's (I know it was for this
purpose, but I still liked the idea). I thought about this for a while, and concluded that this
is probably not the correct solution since the sequence can be accessed from something else
besides the default values.
3. Vadim pointed out that sequences and tables are really seperate entities and now that I
understand them better, I agree with him. The serial type is not really a type, but a shortcut
to create a int4 data type, a key, and a sequence in one command.
I have two current thoughts on how to proceed from here... I want to toss them out, get some
more feedback ... rediscover that I haven't really thought them out well enought and do this all
over again :)
1. Leave it as it is now. It works, just explain to people that sequences and tables are
seperate entities, and the serial type is just a shortcut.
2. Create a new data type serial. I haven't thought idea out much, and need to investigate it
some more. I'm thinking it would be binary equivilent with the int4 type, and use most of the
existing seqence code, but not put the seqence name entry in the pg_class system table. Does
this sound like a feasible idea?
Thanks for your input.
-Ryan
Import Notes
Resolved by subject fallback
Wow. Serious effort here.
Ryan Bradetich wrote:
1. Leave it as it is now. It works, just explain to
people that sequences and tables are seperate entities,
and the serial type is just a shortcut.
I dislike this approach. It seems that it is hiding detail
that is necessary for proper maintence. It isn't that
hard to type in the code. IMHO, the shortcut causes more
confusion that it helps. So, I propose a third option:
0. Remove the SERIAL construct.
2. Create a new data type serial. I haven't thought idea
out much, and need to investigate it some more. I'm thinking
it would be binary equivilent with the int4 type, and use
most of the existing seqence code, but not put the seqence
name entry in the pg_class system table. Does this sound
like a feasible idea?
This dosn't sound all that bad... but I really
wonder what the advantage is.
I vote for "0". Sorry to dissappoint.
Clark
Ryan Bradetich wrote:
I've finished that patch for the serial->sequences but I am not happy with it.
I don't think belive my proposed idea is the correct approach. I still want to work on this
idea, but I'm not sure how to proceed.Here are my thoughts:
1. If I use my proposed idea, everything works great until you dump/reload the table. The dump
doesn't store the data type as serial, but as int with a default and a sequence. So when the
table gets reloaded, the relationship nolonger exists. (I think I finally understand the dump
issues now Tom :))
Add attnum - attribute number of SERIAL column in table -
to new relation: using this pg_dump will know what
columns are SERIAL ones and what are not...
Vadim
Thus spake Ryan Bradetich
2. Create a new data type serial. I haven't thought idea out much, and need to investigate it
some more. I'm thinking it would be binary equivilent with the int4 type, and use most of the
existing seqence code, but not put the seqence name entry in the pg_class system table. Does
this sound like a feasible idea?
I like it. I always wondered why serial wasn't a real datatype. I never
wondered out loud for fear of finding myself with a new project. :-)
If we are thinking about it, I wonder if we could enhance it somewhat.
Under the current system, the attribute values and underlying method
are divorced but if we make serial a first class type then we can look
at the values, perhaps, when getting the next number. For example, if
we add a row and specify a number, the system can note that and skip
that number later. That would certainly fix the dump/restore problem.
Alternatively, maybe we can enforce the serialism of the type. Even
if the user specifies a value, ignore it and put the next number in
anyway. Of course, that just brings us back to the dump/restore
problem so...
Do as above but allow the user to specify a number as long as it is
available and is lower than the next number in the series. When
the restore happens, you need to set the start value to the previous
next value then the inserts can restore with all the old values. You
can even safely insert new records while the restore is happening.
If we decide to leave things more or less as they are, how about a new
flag for sequences and indexes that sets a row as system generated
rather than user specified? We can then set that field when a sequence
or index is generated by the system such as for the serial type or
primary keys. Dump could then be written to ignore these rows on
output. That would deal with the current issue.
Just some random thought from someone who woke up too early.
--
D'Arcy J.M. Cain <darcy@{druid|vex}.net> | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 424 2871 (DoD#0082) (eNTP) | what's for dinner.
Thus spake Clark Evans
0. Remove the SERIAL construct.
Ack! No! Once we ship a feature I think we better be very careful
about dropping it.
--
D'Arcy J.M. Cain <darcy@{druid|vex}.net> | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 424 2871 (DoD#0082) (eNTP) | what's for dinner.
Wow. Serious effort here.
Ryan Bradetich wrote:
1. Leave it as it is now. It works, just explain to
people that sequences and tables are seperate entities,
and the serial type is just a shortcut.I dislike this approach. It seems that it is hiding detail
that is necessary for proper maintence. It isn't that
hard to type in the code. IMHO, the shortcut causes more
confusion that it helps. So, I propose a third option:0. Remove the SERIAL construct.
When they create a serial, we tell them:
ltest=> create table testx(x serial);
NOTICE: CREATE TABLE will create implicit sequence testx_x_seq for
SERIAL column testx.x
NOTICE: CREATE TABLE/UNIQUE will create implicit index testx_x_key for
table testx
CREATE
So it is not so terrible to tell them they have to delete it when
finished. We could also add a column to pg_class which tells us this
sequence was auto-created from oid 12, and remove it when we delete a
table. Or, we could name just try to delete a sequence that has the
name of the table with _seq at the end.
--
Bruce Momjian | http://www.op.net/~candle
maillist@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
"D'Arcy" "J.M." Cain <darcy@druid.net> writes:
Thus spake Ryan Bradetich
2. Create a new data type serial. I haven't thought idea out much,
and need to investigate it some more. I'm thinking it would be binary
equivilent with the int4 type, and use most of the existing seqence
code, but not put the seqence name entry in the pg_class system
table. Does this sound like a feasible idea?
I like it.
A binary-equivalent type does seem like a handier way to represent
SERIAL than what we are doing. You still need a way to find the
associated sequence object, though, so a table mapping from
table-and-column to sequence OID is still necessary. (Unless we
were to use the atttypmod field for the column to hold the sequence
object's OID? Seems a tad fragile though, since there's no way to
update an atttypmod field in an existing table.)
I don't like the idea of not putting the sequence name into pg_class.
That would mean that the sequence object is not directly accessible
for housekeeping operations. If you do that, you'd have to invent
all-new ways to do the following:
* currval, setval, nextval (yes there are scenarios where a
direct nextval on the sequence is useful)
* dump and reload the sequence in pg_dump
Alternatively, maybe we can enforce the serialism of the type. Even
if the user specifies a value, ignore it and put the next number in
anyway.
I don't like that at *all*.
Do as above but allow the user to specify a number as long as it is
available and is lower than the next number in the series.
I think better would be that the sequence value is silently forced to
be at least as large as the inserted number, whenever a specific number
is inserted into a SERIAL field. That would ensure we never generate
duplicates, but not require keeping any extra state.
If we decide to leave things more or less as they are, how about a new
flag for sequences and indexes that sets a row as system generated
rather than user specified? We can then set that field when a sequence
or index is generated by the system such as for the serial type or
primary keys.
Yes, it'd be nice to think about fixing up primary-key implicit indexes
while we are at it --- they have some of the same problems as SERIAL ...
regards, tom lane
Import Notes
Reply to msg id not found: YourmessageofWed17Mar1999050300-0500m10NDAC-0000cKC@druid.net | Resolved by subject fallback
"D'Arcy" "J.M." Cain <darcy@druid.net> writes:
Thus spake Ryan Bradetich
2. Create a new data type serial. I haven't thought idea out much,
and need to investigate it some more. I'm thinking it would be binary
equivilent with the int4 type, and use most of the existing seqence
code, but not put the seqence name entry in the pg_class system
table. Does this sound like a feasible idea?I like it.
A binary-equivalent type does seem like a handier way to represent
SERIAL than what we are doing. You still need a way to find the
associated sequence object, though, so a table mapping from
table-and-column to sequence OID is still necessary. (Unless we
were to use the atttypmod field for the column to hold the sequence
object's OID? Seems a tad fragile though, since there's no way to
update an atttypmod field in an existing table.)
atttypmod seems like a perfect idea. We also need a unique type for
large objects, so oid's and large objects can be distinguished. We
could do both at the same time, and with Thomas's new type coersion
stuff, we don't need to create tons of functions for each new type.
I don't like the idea of not putting the sequence name into pg_class.
That would mean that the sequence object is not directly accessible
for housekeeping operations. If you do that, you'd have to invent
all-new ways to do the following:
* currval, setval, nextval (yes there are scenarios where a
direct nextval on the sequence is useful)
* dump and reload the sequence in pg_dump
Yes, let's keep it in pg_class. No reason not to.
If we decide to leave things more or less as they are, how about a new
flag for sequences and indexes that sets a row as system generated
rather than user specified? We can then set that field when a sequence
or index is generated by the system such as for the serial type or
primary keys.Yes, it'd be nice to think about fixing up primary-key implicit indexes
while we are at it --- they have some of the same problems as SERIAL ...
My guess is that 6.5 is too close to be making such sweeping changes,
though the pg_dump problems with SERIAL certainly make this an important
issue.
--
Bruce Momjian | http://www.op.net/~candle
maillist@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Sorry for the igorance, but I'm not quite
understanding. Assuming a new SERIAL type
is made. What would be the difference
between the new type and an OID?
Thanks!
Clark
Bruce Momjian wrote:
It would only mark the the column as an OID that is used for Serial.
The same thing with large objects, so it is an oid, and used for large
objects. It allows pg_dump and other programs to understand the use of
the oid.
So, the serial column and the OID column would be one and
the same? Why is there a sequence problem then?
Clark
Import Notes
Reference msg id not found: 199903172349.SAA21343@candle.pha.pa.us | Resolved by subject fallback
Sorry for the igorance, but I'm not quite
understanding. Assuming a new SERIAL type
is made. What would be the difference
between the new type and an OID?
It would only mark the the column as an OID that is used for Serial.
The same thing with large objects, so it is an oid, and used for large
objects. It allows pg_dump and other programs to understand the use of
the oid.
--
Bruce Momjian | http://www.op.net/~candle
maillist@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Bruce Momjian wrote:
It would only mark the the column as an OID that is used for Serial.
The same thing with large objects, so it is an oid, and used for large
objects. It allows pg_dump and other programs to understand the use of
the oid.So, the serial column and the OID column would be one and
the same? Why is there a sequence problem then?
It just marks the oid column as being a serial column. We also have
atttypmod, and could easily use that for marking oid's used for serial,
and those used for large objects. Would be nice.
--
Bruce Momjian | http://www.op.net/~candle
maillist@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026