OID wraparound: summary and proposal

Started by Tom Laneover 24 years ago74 messages
#1Tom Lane
tgl@sss.pgh.pa.us

Given Hiroshi's objections, and the likelihood of compatibility problems
for existing applications, I am now thinking that it's not a good idea to
turn off OID generation by default. (At least not for 7.2 --- maybe in
some future release we could change the default.)

Based on the discussion so far, here is an attempt to flesh out the
details of what to do with OIDs for 7.2:

1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE.
The default behavior will be WITH OIDS.

Note: there was some discussion of a GUC variable to control the default.
I'm leaning against this, mainly because having one would mean that
pg_dump *must* write WITH OIDS or WITHOUT OIDS in every CREATE TABLE;
else it couldn't be sure that the database schema would be correctly
reconstructed. That would create dump-script portability problems and
negate some of the point of having a GUC variable in the first place.
So I'm thinking a fixed default is better.

Note: an alternative syntax possibility is to make it look like the "with"
option clauses for functions and indexes: "WITH (oids)" or "WITH (noOids)".
This is uglier today, but would start to look more attractive if we invent
additional CREATE TABLE options in the future --- there'd be a place to
put 'em. Comments?

2. A child table will be forced to have OIDs if any of its parents do,
even if WITHOUT OIDS is specified in the child's CREATE command. This is
on the theory that the OID ought to act like an inherited column.

3. For a table without OIDs, no entry will be made in pg_attribute for
the OID column, so an attempt to reference the OID column will draw a
"no such column" error. (An alternative is to allow OID to read as nulls,
but it seemed that people preferred the error to be raised.)

4. When inserting into an OID-less table, the INSERT result string will
always show 0 for the OID.

5. A "relhasoids" boolean column will be added to pg_class to signal
whether a table has OIDs or not.

6. COPY out WITH OIDS will ignore the "WITH OIDS" specification if the
table has no OIDs. (Alternative possibility: raise an error --- is that
better?) COPY in WITH OIDS will silently drop the incoming OID values.

7. Physical tuple headers won't change. If no OIDs are assigned for a
particular table, the OID field in the header will be left zero.

8. OID generation will be disabled for those system tables that don't need
it --- pg_listener, pg_largeobject, and pg_attribute being some major
offenders that consume lots of OIDs.

9. To continue to support COMMENT ON COLUMN when columns have no OIDs,
pg_description will be modified so that its primary key is (object type,
object OID, column number) --- this also solves the problem that comments
break if there are duplicate OIDs in different system tables. The object
type is the OID of the system catalog in which the object OID appears.
The column number field will be zero for all object types except columns.
For a column comment, the object type and OID fields will refer to the
parent table, and column number will be nonzero.

10. pg_dump will be modified to do the appropriate things with OIDs. Are
there any other application programs that need to change?

We had also talked about adding an INSERT ... RETURNING feature to allow
applications to eliminate their dependence on looking at the OID returned
by an INSERT command. I think this is a good idea, but there are still
a number of unsolved issues about how it should interact with rules.
Accordingly, I'm not going to try to include it in this batch of work.

Comments?

regards, tom lane

#2Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Tom Lane (#1)
Re: OID wraparound: summary and proposal

On Wed, 1 Aug 2001, Tom Lane wrote:

Based on the discussion so far, here is an attempt to flesh out the
details of what to do with OIDs for 7.2:

1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE.
The default behavior will be WITH OIDS.

Note: there was some discussion of a GUC variable to control the default.

Note: an alternative syntax possibility is to make it look like the "with"
option clauses for functions and indexes: "WITH (oids)" or "WITH (noOids)".
This is uglier today, but would start to look more attractive if we invent
additional CREATE TABLE options in the future --- there'd be a place to
put 'em. Comments?

I think a fixed default and placing it in parentheses are probably good
ideas.

3. For a table without OIDs, no entry will be made in pg_attribute for
the OID column, so an attempt to reference the OID column will draw a
"no such column" error. (An alternative is to allow OID to read as nulls,
but it seemed that people preferred the error to be raised.)

Okay, at least the foreign key stuff will need to change (since it does a
select oid), but I don't think it ever does anything with that except
check for existance, so I could probably make it select 1 as reasonable
replacement.

#3Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#1)
Re: OID wraparound: summary and proposal

Given Hiroshi's objections, and the likelihood of compatibility problems
for existing applications, I am now thinking that it's not a good idea to
turn off OID generation by default. (At least not for 7.2 --- maybe in
some future release we could change the default.)

This seems good. People with oid concerns usually have 1-2 huge tables
and the rest are small.

Based on the discussion so far, here is an attempt to flesh out the
details of what to do with OIDs for 7.2:

1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE.
The default behavior will be WITH OIDS.

Makes sense.

Note: there was some discussion of a GUC variable to control the default.
I'm leaning against this, mainly because having one would mean that
pg_dump *must* write WITH OIDS or WITHOUT OIDS in every CREATE TABLE;
else it couldn't be sure that the database schema would be correctly
reconstructed. That would create dump-script portability problems and
negate some of the point of having a GUC variable in the first place.
So I'm thinking a fixed default is better.

Good point.

Note: an alternative syntax possibility is to make it look like the "with"
option clauses for functions and indexes: "WITH (oids)" or "WITH (noOids)".
This is uglier today, but would start to look more attractive if we invent
additional CREATE TABLE options in the future --- there'd be a place to
put 'em. Comments?

I don't like the parens. Looks ugly and I am not used to seeing them
used that way. I can imagine later using WITH NOOIDS, NOBIBBLE, BABBLE.
Maybe the syntax should be WITH OID, WITH NOOID?

2. A child table will be forced to have OIDs if any of its parents do,
even if WITHOUT OIDS is specified in the child's CREATE command. This is
on the theory that the OID ought to act like an inherited column.

Good point.

3. For a table without OIDs, no entry will be made in pg_attribute for
the OID column, so an attempt to reference the OID column will draw a
"no such column" error. (An alternative is to allow OID to read as nulls,
but it seemed that people preferred the error to be raised.)

Makes sense.

6. COPY out WITH OIDS will ignore the "WITH OIDS" specification if the
table has no OIDs. (Alternative possibility: raise an error --- is that
better?) COPY in WITH OIDS will silently drop the incoming OID values.

Obviously, the case here is that COPY WITH OIDS alone on a non-oid table
should throw an error, while pg_dump -o should work on a database with
mixed oid/non-oid. I think the right thing would be to have pg_dump
check pg_class.relhasoids and issue a proper COPY statement to match the
existing table.

7. Physical tuple headers won't change. If no OIDs are assigned for a
particular table, the OID field in the header will be left zero.

8. OID generation will be disabled for those system tables that don't need
it --- pg_listener, pg_largeobject, and pg_attribute being some major
offenders that consume lots of OIDs.

9. To continue to support COMMENT ON COLUMN when columns have no OIDs,
pg_description will be modified so that its primary key is (object type,
object OID, column number) --- this also solves the problem that comments
break if there are duplicate OIDs in different system tables. The object
type is the OID of the system catalog in which the object OID appears.
The column number field will be zero for all object types except columns.
For a column comment, the object type and OID fields will refer to the
parent table, and column number will be nonzero.

Sounds like a hack. I still prefer pg_attribute to have oids. Can we
have temp tables have no pg_attribute oids? A hack on a hack?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#3)
Re: OID wraparound: summary and proposal

Bruce Momjian <pgman@candle.pha.pa.us> writes:

6. COPY out WITH OIDS will ignore the "WITH OIDS" specification if the
table has no OIDs. (Alternative possibility: raise an error --- is that
better?) COPY in WITH OIDS will silently drop the incoming OID values.

Obviously, the case here is that COPY WITH OIDS alone on a non-oid table
should throw an error, while pg_dump -o should work on a database with
mixed oid/non-oid. I think the right thing would be to have pg_dump
check pg_class.relhasoids and issue a proper COPY statement to match the
existing table.

pg_dump clearly will need to do that, so it isn't really going to be the
issue. The question is what to do when a less-clueful app issues a COPY
WITH OIDS on an OID-less table. For input, I see no downside to just
ignoring the incoming OIDs. For output, I can see three reasonable
possibilities:

A. Pretend WITH OIDS wasn't mentioned. This might seem to be
"do the right thing", but a rather strong objection is that the
app will not get back the data it was expecting.

B. Return NULLs or 0s for the OIDs column.

C. Raise an error and refuse to do the copy at all.

C is probably the most conservative answer.

9. To continue to support COMMENT ON COLUMN when columns have no OIDs,
pg_description will be modified so that its primary key is (object type,
object OID, column number) --- this also solves the problem that comments
break if there are duplicate OIDs in different system tables. The object
type is the OID of the system catalog in which the object OID appears.
The column number field will be zero for all object types except columns.
For a column comment, the object type and OID fields will refer to the
parent table, and column number will be nonzero.

Sounds like a hack.

How so? pg_description is broken anyway given that we don't enforce OID
uniqueness across system catalogs. Also, in the future we could
consider overloading the <column number> column to have meanings for
other object types. I could imagine using it to attach documentation to
each of the input arguments of a function, for example.

regards, tom lane

#5Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#4)
Re: OID wraparound: summary and proposal

Bruce Momjian <pgman@candle.pha.pa.us> writes:

6. COPY out WITH OIDS will ignore the "WITH OIDS" specification if the
table has no OIDs. (Alternative possibility: raise an error --- is that
better?) COPY in WITH OIDS will silently drop the incoming OID values.

Obviously, the case here is that COPY WITH OIDS alone on a non-oid table
should throw an error, while pg_dump -o should work on a database with
mixed oid/non-oid. I think the right thing would be to have pg_dump
check pg_class.relhasoids and issue a proper COPY statement to match the
existing table.

pg_dump clearly will need to do that, so it isn't really going to be the
issue. The question is what to do when a less-clueful app issues a COPY
WITH OIDS on an OID-less table. For input, I see no downside to just
ignoring the incoming OIDs. For output, I can see three reasonable
possibilities:

A. Pretend WITH OIDS wasn't mentioned. This might seem to be
"do the right thing", but a rather strong objection is that the
app will not get back the data it was expecting.

B. Return NULLs or 0s for the OIDs column.

C. Raise an error and refuse to do the copy at all.

C is probably the most conservative answer.

If we fail on load, we should fail on dump. Why not fail on COPY WITH
OIDS on a non-oid table?

9. To continue to support COMMENT ON COLUMN when columns have no OIDs,
pg_description will be modified so that its primary key is (object type,
object OID, column number) --- this also solves the problem that comments
break if there are duplicate OIDs in different system tables. The object
type is the OID of the system catalog in which the object OID appears.
The column number field will be zero for all object types except columns.
For a column comment, the object type and OID fields will refer to the
parent table, and column number will be nonzero.

Sounds like a hack.

How so? pg_description is broken anyway given that we don't enforce OID
uniqueness across system catalogs. Also, in the future we could

We have a script to detect them and the oid counter it unique. In what
way do we not enforce it.

consider overloading the <column number> column to have meanings for
other object types. I could imagine using it to attach documentation to
each of the input arguments of a function, for example.

Interesting idea.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#5)
Re: OID wraparound: summary and proposal

Bruce Momjian <pgman@candle.pha.pa.us> writes:

For input, I see no downside to just
ignoring the incoming OIDs. For output, I can see three reasonable
possibilities:

A. Pretend WITH OIDS wasn't mentioned. This might seem to be
"do the right thing", but a rather strong objection is that the
app will not get back the data it was expecting.

B. Return NULLs or 0s for the OIDs column.

C. Raise an error and refuse to do the copy at all.

C is probably the most conservative answer.

If we fail on load, we should fail on dump. Why not fail on COPY WITH
OIDS on a non-oid table?

I'm confused --- I was proposing that we *not* fail on load. What's the
point of failing on load?

How so? pg_description is broken anyway given that we don't enforce OID
uniqueness across system catalogs. Also, in the future we could

We have a script to detect them and the oid counter it unique. In what
way do we not enforce it.

In a running system, once the OID counter wraps around there's no
guarantee that you won't have duplicate OIDs in different system
tables. The only enforcement mechanism we have is the unique indexes,
and those will only check per-table. However, that's fine --- it's
as much as we need. For everything except pg_description, that is.
Since pg_description currently makes an unchecked and uncheckable
assumption of global uniqueness of OIDs, it's broken.

regards, tom lane

#7Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Tom Lane (#1)
Re: OID wraparound: summary and proposal

Tom Lane wrote:

Given Hiroshi's objections, and the likelihood of compatibility problems
for existing applications, I am now thinking that it's not a good idea to
turn off OID generation by default. (At least not for 7.2 --- maybe in
some future release we could change the default.)

Would OIDs be globally unique or per table ?

regards,
Hiroshi Inoue

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hiroshi Inoue (#7)
Re: OID wraparound: summary and proposal

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

Tom Lane wrote:

Given Hiroshi's objections, and the likelihood of compatibility problems
for existing applications, I am now thinking that it's not a good idea to
turn off OID generation by default.

Would OIDs be globally unique or per table ?

Same as now: if you have a unique index on 'em, they're unique within a
table; otherwise, no guarantee at all (once the system wraps around).

We should document this state of affairs better, of course, but I'm not
proposing to change it. The point here is just to let people suppress
OIDs for tables that don't need them, and thereby postpone OID wraparound.

regards, tom lane

#9Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Tom Lane (#1)
Re: OID wraparound: summary and proposal

Tom Lane wrote:

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

Tom Lane wrote:

Given Hiroshi's objections, and the likelihood of compatibility problems
for existing applications, I am now thinking that it's not a good idea to
turn off OID generation by default.

Would OIDs be globally unique or per table ?

Same as now: if you have a unique index on 'em, they're unique within a
table; otherwise, no guarantee at all (once the system wraps around).

OIDs per table seems more important than others.

Strangely enough, I've seen no objection to optional OIDs
other than mine. Probably it was my mistake to have formulated
a plan on the flimsy assumption.

regards,
Hiroshi Inoue

#10mlw
markw@mohawksoft.com
In reply to: Tom Lane (#1)
Re: OID wraparound: summary and proposal

Maybe I'm being horribly stupid here, but....

If the thinking is that some tables can escape having an OID, thus meaning OIDs
can be controlled by table, how hard would it be to have an OID range on a per
table basis?

Where each table to have its own notion of an OID, then OID wrap/depletion
should be minimal.

Tom Lane wrote:

Given Hiroshi's objections, and the likelihood of compatibility problems
for existing applications, I am now thinking that it's not a good idea to
turn off OID generation by default. (At least not for 7.2 --- maybe in
some future release we could change the default.)

Based on the discussion so far, here is an attempt to flesh out the
details of what to do with OIDs for 7.2:

1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE.
The default behavior will be WITH OIDS.

Note: there was some discussion of a GUC variable to control the default.
I'm leaning against this, mainly because having one would mean that
pg_dump *must* write WITH OIDS or WITHOUT OIDS in every CREATE TABLE;
else it couldn't be sure that the database schema would be correctly
reconstructed. That would create dump-script portability problems and
negate some of the point of having a GUC variable in the first place.
So I'm thinking a fixed default is better.

Note: an alternative syntax possibility is to make it look like the "with"
option clauses for functions and indexes: "WITH (oids)" or "WITH (noOids)".
This is uglier today, but would start to look more attractive if we invent
additional CREATE TABLE options in the future --- there'd be a place to
put 'em. Comments?

2. A child table will be forced to have OIDs if any of its parents do,
even if WITHOUT OIDS is specified in the child's CREATE command. This is
on the theory that the OID ought to act like an inherited column.

3. For a table without OIDs, no entry will be made in pg_attribute for
the OID column, so an attempt to reference the OID column will draw a
"no such column" error. (An alternative is to allow OID to read as nulls,
but it seemed that people preferred the error to be raised.)

4. When inserting into an OID-less table, the INSERT result string will
always show 0 for the OID.

5. A "relhasoids" boolean column will be added to pg_class to signal
whether a table has OIDs or not.

6. COPY out WITH OIDS will ignore the "WITH OIDS" specification if the
table has no OIDs. (Alternative possibility: raise an error --- is that
better?) COPY in WITH OIDS will silently drop the incoming OID values.

7. Physical tuple headers won't change. If no OIDs are assigned for a
particular table, the OID field in the header will be left zero.

8. OID generation will be disabled for those system tables that don't need
it --- pg_listener, pg_largeobject, and pg_attribute being some major
offenders that consume lots of OIDs.

9. To continue to support COMMENT ON COLUMN when columns have no OIDs,
pg_description will be modified so that its primary key is (object type,
object OID, column number) --- this also solves the problem that comments
break if there are duplicate OIDs in different system tables. The object
type is the OID of the system catalog in which the object OID appears.
The column number field will be zero for all object types except columns.
For a column comment, the object type and OID fields will refer to the
parent table, and column number will be nonzero.

10. pg_dump will be modified to do the appropriate things with OIDs. Are
there any other application programs that need to change?

We had also talked about adding an INSERT ... RETURNING feature to allow
applications to eliminate their dependence on looking at the OID returned
by an INSERT command. I think this is a good idea, but there are still
a number of unsolved issues about how it should interact with rules.
Accordingly, I'm not going to try to include it in this batch of work.

Comments?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

--
5-4-3-2-1 Thunderbirds are GO!
------------------------
http://www.mohawksoft.com

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: mlw (#10)
Re: OID wraparound: summary and proposal

mlw <markw@mohawksoft.com> writes:

how hard would it be to have an OID range on a per
table basis?

The existing OID generator is a system-wide counter, and couldn't
reasonably be expected to do something like that.

There was some talk of (in essence) eliminating the present OID
generator mechanism and giving each table its own sequence object for
generating per-table OIDs. It's an interesting thought, but I'm
concerned about the overhead involved. At the very least we'd need to
reimplement sequence objects in a lower-overhead fashion (eg, make 'em
rows in a pg_sequence table rather than free-standing almost-tables).

Might be worth doing someday, but I think it's orthogonal to what I'm
proposing at present. There'd still be a need to suppress OID
generation on tables that don't need OIDs and might have more than
4 billion inserts during their lifetime.

regards, tom lane

#12Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#6)
Re: OID wraparound: summary and proposal

Bruce Momjian <pgman@candle.pha.pa.us> writes:

For input, I see no downside to just
ignoring the incoming OIDs. For output, I can see three reasonable
possibilities:

A. Pretend WITH OIDS wasn't mentioned. This might seem to be
"do the right thing", but a rather strong objection is that the
app will not get back the data it was expecting.

B. Return NULLs or 0s for the OIDs column.

C. Raise an error and refuse to do the copy at all.

C is probably the most conservative answer.

If we fail on load, we should fail on dump. Why not fail on COPY WITH
OIDS on a non-oid table?

I'm confused --- I was proposing that we *not* fail on load. What's the
point of failing on load?

I meant to say we should fail on dump _and_ load. If we don't we are
throwing away the oid's they are loading because though the table has no
oid column. Seems like something that should fail.

How so? pg_description is broken anyway given that we don't enforce OID
uniqueness across system catalogs. Also, in the future we could

We have a script to detect them and the oid counter it unique. In what
way do we not enforce it.

In a running system, once the OID counter wraps around there's no
guarantee that you won't have duplicate OIDs in different system
tables. The only enforcement mechanism we have is the unique indexes,
and those will only check per-table. However, that's fine --- it's
as much as we need. For everything except pg_description, that is.
Since pg_description currently makes an unchecked and uncheckable
assumption of global uniqueness of OIDs, it's broken.

If you consider random table creation failures acceptible. In oid
wraparound, whether pg_description could point to two rows with the same
oid is the smallest part of our problem. I think the whole idea we can
run reliably with an oid wraparound is questionable.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#13Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Bruce Momjian (#12)
AW: OID wraparound: summary and proposal

Strangely enough, I've seen no objection to optional OIDs
other than mine. Probably it was my mistake to have formulated
a plan on the flimsy assumption.

I for one am more concerned about adding additional per
tuple overhead (moving from 32 -> 64bit) than loosing OID's
on some large tables. Imho optional OID's is the best way to combine
both worlds. OID's only where you absolutely need them, and thus
a good chance that wraparound does not happen during the lifetime of
one application. (And all this by reducing overhead, and not adding
overhead :-)

Andreas

#14Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Zeugswetter Andreas SB (#13)
Re: AW: OID wraparound: summary and proposal

Strangely enough, I've seen no objection to optional OIDs
other than mine. Probably it was my mistake to have formulated
a plan on the flimsy assumption.

I for one am more concerned about adding additional per
tuple overhead (moving from 32 -> 64bit) than loosing OID's
on some large tables. Imho optional OID's is the best way to combine
both worlds. OID's only where you absolutely need them, and thus
a good chance that wraparound does not happen during the lifetime of
one application. (And all this by reducing overhead, and not adding
overhead :-)

Agreed, the big selling point for me and optional oid's was removing
their overhead from the tuple header. We need to trim that baby down!

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#15Hannu Krosing
hannu@tm.ee
In reply to: Tom Lane (#1)
Re: Re: OID wraparound: summary and proposal

Tom Lane wrote:

Given Hiroshi's objections, and the likelihood of compatibility problems
for existing applications, I am now thinking that it's not a good idea to
turn off OID generation by default. (At least not for 7.2 --- maybe in
some future release we could change the default.)

Based on the discussion so far, here is an attempt to flesh out the
details of what to do with OIDs for 7.2:

Also OIDS should be promoted to 8-byte integers at some future time.

1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE.
The default behavior will be WITH OIDS.

Note: there was some discussion of a GUC variable to control the default.
I'm leaning against this, mainly because having one would mean that
pg_dump *must* write WITH OIDS or WITHOUT OIDS in every CREATE TABLE;
else it couldn't be sure that the database schema would be correctly
reconstructed. That would create dump-script portability problems and
negate some of the point of having a GUC variable in the first place.
So I'm thinking a fixed default is better.

Note: an alternative syntax possibility is to make it look like the "with"
option clauses for functions and indexes: "WITH (oids)" or "WITH (noOids)".
This is uglier today, but would start to look more attractive if we invent
additional CREATE TABLE options in the future --- there'd be a place to
put 'em. Comments?

2. A child table will be forced to have OIDs if any of its parents do,
even if WITHOUT OIDS is specified in the child's CREATE command. This is
on the theory that the OID ought to act like an inherited column.

3. For a table without OIDs, no entry will be made in pg_attribute for
the OID column, so an attempt to reference the OID column will draw a
"no such column" error. (An alternative is to allow OID to read as nulls,
but it seemed that people preferred the error to be raised.)

4. When inserting into an OID-less table, the INSERT result string will
always show 0 for the OID.

5. A "relhasoids" boolean column will be added to pg_class to signal
whether a table has OIDs or not.

6. COPY out WITH OIDS will ignore the "WITH OIDS" specification if the
table has no OIDs. (Alternative possibility: raise an error --- is that
better?) COPY in WITH OIDS will silently drop the incoming OID values.

7. Physical tuple headers won't change. If no OIDs are assigned for a
particular table, the OID field in the header will be left zero.

8. OID generation will be disabled for those system tables that don't need
it --- pg_listener, pg_largeobject, and pg_attribute being some major
offenders that consume lots of OIDs.

1-8 sounds good

9. To continue to support COMMENT ON COLUMN when columns have no OIDs,
pg_description will be modified so that its primary key is (object type,
object OID, column number) --- this also solves the problem that comments
break if there are duplicate OIDs in different system tables.

Hm.. To me this sounds like allowing duplicates in an unique index in
case
there happen to be duplicate keys there ;)

IMHO duplicate OID's in system tables should be treated as bug's - if
they
are there they are meant to break stuff.

The object
type is the OID of the system catalog in which the object OID appears.
The column number field will be zero for all object types except columns.
For a column comment, the object type and OID fields will refer to the
parent table, and column number will be nonzero.

What happens to columns added to inherited tables ?

Show quoted text

10. pg_dump will be modified to do the appropriate things with OIDs. Are
there any other application programs that need to change?

We had also talked about adding an INSERT ... RETURNING feature to allow
applications to eliminate their dependence on looking at the OID returned
by an INSERT command. I think this is a good idea, but there are still
a number of unsolved issues about how it should interact with rules.
Accordingly, I'm not going to try to include it in this batch of work.

#16Hannu Krosing
hannu@tm.ee
In reply to: Zeugswetter Andreas SB (#13)
Re: OID wraparound: summary and proposal

Nathan Myers wrote:

On Thu, Aug 02, 2001 at 09:28:18AM +0200, Zeugswetter Andreas SB wrote:

Strangely enough, I've seen no objection to optional OIDs
other than mine. Probably it was my mistake to have formulated
a plan on the flimsy assumption.

I for one am more concerned about adding additional per
tuple overhead (moving from 32 -> 64bit) than loosing OID's
on some large tables. Imho optional OID's is the best way to combine
both worlds.

At the same time that we announce support for optional OIDs,
we should announce that, in future releases, OIDs will only be
guaranteed unique (modulo wraparounds) within a single table.

What would the purpose of such an announcement be ???

OID is "Object IDentifier", meant to uniquely identify ANY object in an
Object-Relational Database ,which PostgreSQL sometimes claims itself to
be.

If they are unique only within a single table then they are just
system-supplied primary key fields without a default index - quite
useless IMHO

I hope someone takes up the task of putting back some of the
niftier features of original Postgres/postgres95 and adding more OO
features. Deprecating OIDs won't help there .

--------------------
Hannu

#17Hannu Krosing
hannu@tm.ee
In reply to: Tom Lane (#1)
Re: Re: OID wraparound: summary and proposal

Tom Lane wrote:

Hannu Krosing <hannu@tm.ee> writes:

9. To continue to support COMMENT ON COLUMN when columns have no OIDs,
pg_description will be modified so that its primary key is (object type,
object OID, column number) --- this also solves the problem that comments
break if there are duplicate OIDs in different system tables.

Hm.. To me this sounds like allowing duplicates in an unique index in
case there happen to be duplicate keys there ;)

Unless you want to implement a global unique index

Or insert/update trigger

that can enforce uniqueness of OIDs across all the system tables, I don't
think that approach is tenable.

As I wrote in another mail to this list, AFAIK OID is supposed to be
Object Identifier - something that can be used to identify any object
in a unique fashion.

When (and if ;) we will implement SQL3's UNDER, we should, IMHO, make a
primary
key inherited and *unique* over all tables created UNDER the main table,
meaning
that we will need a way to have uniqueness constraint spanning multiple
tables.

( At least logically multiple tables, as IMHO UNDER with its single
inheritance is
best implemented in a single table with a bit more flexible column
structure. )

At that time we could theoretically inherit all system tables that have
OID
column from table "pg_system(oid oid primary key);"

pg_description is broken as it stands. Bruce
doesn't like the "column number" part of my proposal --- I suppose he'd
rather see the pg_description key as just <object type, object OID> with
object type referring to pg_attribute if it's a comment on column.
That would work too as far as fixing the lack of uniqueness goes, but it
still leaves us with pg_attribute as a significant consumer of OIDs.

That would probably be a problem with 4-byte OIDs, there is an ample
supply of 8-byte ones

I do like dropping OID from pg_listener, as it is a mostly empty and
really
rapidly changing table, but I see little value in dropping oid from
pg_attribute.

BTW, don't indexes, triggers or saved plans use OIDs from pg_attribute ?

Since the major point of this exercise (in my mind) is cutting the rate
of consumption of OIDs to postpone wraparound, I want to suppress OIDs
in pg_attribute, and to do that I have to add the column number to
pg_description.

I still think that going to 8-byte OIDs would be the best use of your
time ;)

If you can make the size of oid's a compile time option, then even
better.

Postponing the wraparound by the means you describe may be a fools
errand anyway,
as there are other ways to quickly consume oids that are very likely as
common as
those involving pg_listener, pg_largeobject, and pg_attribute.

Also computers still get faster, and disks still get bigger at the rate
I doubt
you will be able to match by finding ways to postpone the wraparound.

So here I'd like to contradict Vadim's claim that the time of simple
solutions is
over for PostgreSQL - making OID bigger is at least conceptually simple,
it's just
"a small matter of programming" ;)

--------------
Hannu

#18Hannu Krosing
hannu@tm.ee
In reply to: Tom Lane (#1)
Re: Re: OID wraparound: summary and proposal

mlw wrote:

I posted this question earlier, but it looks like it never made it on.

If you can control the OIDs on a per table basis, and some tables need not even
have any, why not let each table have its own OID range? Essentially, each
record will be numbered relative to 0 on its table?

That would really cut down the OID wrap around problem, and allow records to
have a notion of serialization.

What would the meaning of such an "OID" be ?

Apart from duplicating the primary key that is ?

------------------
Hannu

#19Noname
ncm@zembu.com
In reply to: Zeugswetter Andreas SB (#13)
Re: OID wraparound: summary and proposal

On Thu, Aug 02, 2001 at 09:28:18AM +0200, Zeugswetter Andreas SB wrote:

Strangely enough, I've seen no objection to optional OIDs
other than mine. Probably it was my mistake to have formulated
a plan on the flimsy assumption.

I for one am more concerned about adding additional per
tuple overhead (moving from 32 -> 64bit) than loosing OID's
on some large tables. Imho optional OID's is the best way to combine
both worlds.

At the same time that we announce support for optional OIDs,
we should announce that, in future releases, OIDs will only be
guaranteed unique (modulo wraparounds) within a single table.

Nathan Myers
ncm@zembu.com

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noname (#19)
Re: OID wraparound: summary and proposal

ncm@zembu.com (Nathan Myers) writes:

At the same time that we announce support for optional OIDs,
we should announce that, in future releases, OIDs will only be
guaranteed unique (modulo wraparounds) within a single table.

Seems reasonable --- that will give people notice that we're thinking
about separate-OID-generator-per-table ideas.

Right now we don't really document any of these considerations,
but I plan to write something as part of the work I'm about to do.

regards, tom lane

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#15)
Re: Re: OID wraparound: summary and proposal

Hannu Krosing <hannu@tm.ee> writes:

Tom Lane wrote:

Based on the discussion so far, here is an attempt to flesh out the
details of what to do with OIDs for 7.2:

Also OIDS should be promoted to 8-byte integers at some future time.

Perhaps, but I'm trying to focus on what to do for 7.2...

9. To continue to support COMMENT ON COLUMN when columns have no OIDs,
pg_description will be modified so that its primary key is (object type,
object OID, column number) --- this also solves the problem that comments
break if there are duplicate OIDs in different system tables.

Hm.. To me this sounds like allowing duplicates in an unique index in
case there happen to be duplicate keys there ;)

Unless you want to implement a global unique index that can enforce
uniqueness of OIDs across all the system tables, I don't think that
approach is tenable. pg_description is broken as it stands. Bruce
doesn't like the "column number" part of my proposal --- I suppose he'd
rather see the pg_description key as just <object type, object OID> with
object type referring to pg_attribute if it's a comment on column.
That would work too as far as fixing the lack of uniqueness goes, but it
still leaves us with pg_attribute as a significant consumer of OIDs.
Since the major point of this exercise (in my mind) is cutting the rate
of consumption of OIDs to postpone wraparound, I want to suppress OIDs
in pg_attribute, and to do that I have to add the column number to
pg_description.

The column number field will be zero for all object types except columns.
For a column comment, the object type and OID fields will refer to the
parent table, and column number will be nonzero.

What happens to columns added to inherited tables ?

Uh, nothing as far as I can see. We don't presently support auto
inheritance of comments-on-columns, if that's what you were asking for.
Offhand, making that happen seems about equally easy with either
representation of pg_description, so I don't think it's an issue.

regards, tom lane

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#1)
Re: OID wraparound: summary and proposal

mlw <markw@mohawksoft.com> writes:

I posted this question earlier, but it looks like it never made it on.

You did post it, and I answered it: no can do with anything close to the
current implementation of the OID generator. We have one counter for
the whole system, not per-table state.

regards, tom lane

#23mlw
markw@mohawksoft.com
In reply to: Tom Lane (#1)
Re: OID wraparound: summary and proposal

I posted this question earlier, but it looks like it never made it on.

If you can control the OIDs on a per table basis, and some tables need not even
have any, why not let each table have its own OID range? Essentially, each
record will be numbered relative to 0 on its table?

That would really cut down the OID wrap around problem, and allow records to
have a notion of serialization.

Tom Lane wrote:

Given Hiroshi's objections, and the likelihood of compatibility problems
for existing applications, I am now thinking that it's not a good idea to
turn off OID generation by default. (At least not for 7.2 --- maybe in
some future release we could change the default.)

Based on the discussion so far, here is an attempt to flesh out the
details of what to do with OIDs for 7.2:

1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE.
The default behavior will be WITH OIDS.

Note: there was some discussion of a GUC variable to control the default.
I'm leaning against this, mainly because having one would mean that
pg_dump *must* write WITH OIDS or WITHOUT OIDS in every CREATE TABLE;
else it couldn't be sure that the database schema would be correctly
reconstructed. That would create dump-script portability problems and
negate some of the point of having a GUC variable in the first place.
So I'm thinking a fixed default is better.

Note: an alternative syntax possibility is to make it look like the "with"
option clauses for functions and indexes: "WITH (oids)" or "WITH (noOids)".
This is uglier today, but would start to look more attractive if we invent
additional CREATE TABLE options in the future --- there'd be a place to
put 'em. Comments?

2. A child table will be forced to have OIDs if any of its parents do,
even if WITHOUT OIDS is specified in the child's CREATE command. This is
on the theory that the OID ought to act like an inherited column.

3. For a table without OIDs, no entry will be made in pg_attribute for
the OID column, so an attempt to reference the OID column will draw a
"no such column" error. (An alternative is to allow OID to read as nulls,
but it seemed that people preferred the error to be raised.)

4. When inserting into an OID-less table, the INSERT result string will
always show 0 for the OID.

5. A "relhasoids" boolean column will be added to pg_class to signal
whether a table has OIDs or not.

6. COPY out WITH OIDS will ignore the "WITH OIDS" specification if the
table has no OIDs. (Alternative possibility: raise an error --- is that
better?) COPY in WITH OIDS will silently drop the incoming OID values.

7. Physical tuple headers won't change. If no OIDs are assigned for a
particular table, the OID field in the header will be left zero.

8. OID generation will be disabled for those system tables that don't need
it --- pg_listener, pg_largeobject, and pg_attribute being some major
offenders that consume lots of OIDs.

9. To continue to support COMMENT ON COLUMN when columns have no OIDs,
pg_description will be modified so that its primary key is (object type,
object OID, column number) --- this also solves the problem that comments
break if there are duplicate OIDs in different system tables. The object
type is the OID of the system catalog in which the object OID appears.
The column number field will be zero for all object types except columns.
For a column comment, the object type and OID fields will refer to the
parent table, and column number will be nonzero.

10. pg_dump will be modified to do the appropriate things with OIDs. Are
there any other application programs that need to change?

We had also talked about adding an INSERT ... RETURNING feature to allow
applications to eliminate their dependence on looking at the OID returned
by an INSERT command. I think this is a good idea, but there are still
a number of unsolved issues about how it should interact with rules.
Accordingly, I'm not going to try to include it in this batch of work.

Comments?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

--
5-4-3-2-1 Thunderbirds are GO!
------------------------
http://www.mohawksoft.com

#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#17)
Re: Re: OID wraparound: summary and proposal

Hannu Krosing <hannu@tm.ee> writes:

That would probably be a problem with 4-byte OIDs, there is an ample
supply of 8-byte ones

Sure, but I think we are still a few years away from being able to
assume that every platform of interest can support 8-byte OIDs (and
furthermore, won't see a significant performance degradation --- keep
in mind that widening Datum to 8 bytes is a change that affects all
datatypes not just Oid). There's also the Oids-are-in-the-wire-protocol
problem. In short, that's a long-term solution not a near-term one.

BTW, don't indexes, triggers or saved plans use OIDs from pg_attribute ?

Nope. pg_description is the only offender.

regards, tom lane

#25Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Zeugswetter Andreas SB (#13)
Re: AW: OID wraparound: summary and proposal

Zeugswetter Andreas SB wrote:

Strangely enough, I've seen no objection to optional OIDs
other than mine. Probably it was my mistake to have formulated
a plan on the flimsy assumption.

I for one am more concerned about adding additional per
tuple overhead (moving from 32 -> 64bit) than loosing OID's
on some large tables. Imho optional OID's is the best way to combine
both worlds. OID's only where you absolutely need them, and thus
a good chance that wraparound does not happen during the lifetime of
one application. (And all this by reducing overhead, and not adding
overhead :-)

Hmm there seems to be an assumption that people could
know whether they need OID or not for each table.
I've had a plan in ODBC using OID and TID.
Few ODBC users know about ODBC spec. They rarely use
ODBC directly and use middlewares like Access etc.
Could they take care of the necessity of OIDs with
my plan ? Could they know when/how the middlewares
use my new feature effectively ? To tell the truth,
I don't know it precisely.
OK, a user decided to create tables with OIDs unco
nditionally for ODBC but he may encounter the OID
wraparound problem instead....
I don't think that people use the feature with such
silly restrictions.

regards,
Hiroshi Inoue

#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hiroshi Inoue (#25)
Re: AW: OID wraparound: summary and proposal

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

Hmm there seems to be an assumption that people could
know whether they need OID or not for each table.

A good point, and one reason not to make no-OIDs the default. I'm
envisioning that people will turn off OIDs only for tables that they
know will be very large and that they know they don't need OIDs for.

regards, tom lane

#27Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Zeugswetter Andreas SB (#13)
Re: AW: OID wraparound: summary and proposal

Tom Lane wrote:

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

Hmm there seems to be an assumption that people could
know whether they need OID or not for each table.

A good point, and one reason not to make no-OIDs the default. I'm
envisioning that people will turn off OIDs only for tables that they
know will be very large and that they know they don't need OIDs for.

AFAIK few people have voted *OIDs by default* in the
first place. It seems to mean that *default* would
naturally(essentially) be changed to *WITH NO OIDS*.
The followings are the result of vote which I remember
well.

regards,
Hiroshi Inoue

"Mikheev, Vadim" wrote:

OK, we need to vote on whether Oid's are optional,
and whether we can have them not created by default.

Optional OIDs: YES
No OIDs by default: YES

Lamar Owen wrote:

Show quoted text

[trimmed cc:list]
On Wednesday 18 July 2001 17:09, Bruce Momjian wrote:

OK, we need to vote on whether Oid's are optional, and whether we can
have them not created by default.

[All the below IMHO]

OID's should be optional.

System tables that absolutely have to have OIDs may keep them.

No new OID usage, period. Use some other unique primary key.

Default user tables to no OIDs.

#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hiroshi Inoue (#27)
Re: AW: OID wraparound: summary and proposal

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

The followings are the result of vote which I remember
well.

FWIW, I changed my vote ;-). I'm not sure what Vadim and Lamar think
at the moment, but I thought you made good arguments.

regards, tom lane

#29mlw
markw@mohawksoft.com
In reply to: Tom Lane (#1)
Re: OID wraparound: summary and proposal

Tom Lane wrote:

mlw <markw@mohawksoft.com> writes:

I posted this question earlier, but it looks like it never made it on.

You did post it,

Sorry, it never got to me.

and I answered it: no can do with anything close to the
current implementation of the OID generator. We have one counter for
the whole system, not per-table state.

That's a bummer. The concept of a ROWID is really useful, especially for those
that come from an Oracle background, or porting Oracle queries.

--
5-4-3-2-1 Thunderbirds are GO!
------------------------
http://www.mohawksoft.com

#30mlw
markw@mohawksoft.com
In reply to: Tom Lane (#1)
Re: OID wraparound: summary and proposal

Hannu Krosing wrote:

mlw wrote:

I posted this question earlier, but it looks like it never made it on.

If you can control the OIDs on a per table basis, and some tables need not even
have any, why not let each table have its own OID range? Essentially, each
record will be numbered relative to 0 on its table?

That would really cut down the OID wrap around problem, and allow records to
have a notion of serialization.

What would the meaning of such an "OID" be ?

Apart from duplicating the primary key that is ?

Some other databases have the notion of a ROWID which uniquely identifies a row
within a table. OID can be used for that, but it means if you use it, you must
limit the size of your whole database system. The other alternative is to make
a column called "rowid" and a sequence for it and a default of
nextval('table_rowid'). That means more work for those porting.

My thinking was that if the "OID" issue was being addressed, maybe it could be
changed quite a bit. The problem with the current OID is that it severely
limits the capacity of the database AND does not carry with it enough
information.

For instance, as far as I can see, one can not take an OID and make any sort of
determination about what it is. One also needs to know the table and the
database from which it was retrieved. So an OID is meaningless without the
contextual information. Why should it be a system wide limitation when it needs
to be used in the context of a specific table?

That way PostgreSQL has a knowable 4B (or 2B signed) record limit per table,
not per system. One could create a new virtual OID like thing, called SYSOID,
or something, which is a 64 bit value, the upper 4 bytes being the OID of the
table from the catalog, and the lower 4 bytes being the OID of the record.

The SYSOID would really tell you something! Given a SYSOID you could find the
database, the table, and the record.

--
5-4-3-2-1 Thunderbirds are GO!
------------------------
http://www.mohawksoft.com

#31mlw
markw@mohawksoft.com
In reply to: Tom Lane (#1)
Re: OID wraparound: summary and proposal

Tom Lane wrote:

[Snipped]

I think making "WITHOUT OIDS" the default for table creation is the right thing
to do. Here is my reasoning:

An OID is a system wide limitation. 4B or 2B depending on sign-ness. (could
there be some bugs still lurking on high OIDs?) Since the OID is a shared
system wide limited resource, it should be limited to "system" tables which
require it.

To new comers to PostgreSQL this limitation will not be obvious until they hit
it. Then they will kick themselves for not reading more carefully.

An OID does not add any real value to the database developer. Given an OID, one
can not determine anything about the record it represents. One also needs the
table and database from which it came, and even then one has to create an index
on the OID column on the table to get to the record efficiently. It can only
indicate the order in which records were entered.

If people need something like OID for their tables, the documented "preferred
way" could be:
create sequence fubar;
create table fubar
(
rowid integer default nextval('fubar_seq'),
...
);

Then explain that they can use "WITH OID" but there is a system wide limit.

On a side note: I know it is probably a lot of work, and it has been shot down
once, but the notion of a rowid built into a table would be useful. It would
solve wrap around and keep the useful functionality of OID, and be more
efficient and robust than using the sequence.

#32Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#20)
Re: OID wraparound: summary and proposal

ncm@zembu.com (Nathan Myers) writes:

At the same time that we announce support for optional OIDs,
we should announce that, in future releases, OIDs will only be
guaranteed unique (modulo wraparounds) within a single table.

Seems reasonable --- that will give people notice that we're thinking
about separate-OID-generator-per-table ideas.

Right now we don't really document any of these considerations,
but I plan to write something as part of the work I'm about to do.

But why do that if we have sequences?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#33Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#28)
Re: AW: OID wraparound: summary and proposal

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

The followings are the result of vote which I remember
well.

FWIW, I changed my vote ;-). I'm not sure what Vadim and Lamar think
at the moment, but I thought you made good arguments.

I think Vadim was clearly NOOID. I vote OID.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#34Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Bruce Momjian (#33)
AW: AW: OID wraparound: summary and proposal

FWIW, I changed my vote ;-). I'm not sure what Vadim and Lamar think
at the moment, but I thought you made good arguments.

I think Vadim was clearly NOOID. I vote OID.

NOOID, but I can see the arguments on the other side :-)
I would also vote GUC despite the pg_dump issue.

Andreas

#35Dave Cramer
dave@fastcrypt.com
In reply to: Bruce Momjian (#32)
RE: OID wraparound: summary and proposal

I'm not sure this is related to the OID discussion, however I have seen
designs where a unique id is required for all the objects in the
database.

This (IMO) this implies an int8 (or larger) sequence number.

It would be nice if we could have different size sequences. Just thought
I'd throw that in.

Dave

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Bruce Momjian
Sent: August 3, 2001 9:22 AM
To: Tom Lane
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] OID wraparound: summary and proposal

ncm@zembu.com (Nathan Myers) writes:

At the same time that we announce support for optional OIDs, we
should announce that, in future releases, OIDs will only be
guaranteed unique (modulo wraparounds) within a single table.

Seems reasonable --- that will give people notice that we're thinking
about separate-OID-generator-per-table ideas.

Right now we don't really document any of these considerations, but I
plan to write something as part of the work I'm about to do.

But why do that if we have sequences?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania
19026

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

#36Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: mlw (#30)
RE: Re: OID wraparound: summary and proposal

The analog of ROWID in PostgreSQL is TID rather than OID
because TID is a physical address of a tuple within a table.
However there's a significant difference. Unfortunately TID
is transient. It is changed by UPDATE and VACUUM.
Though TIDs are unavailable for critical use, OIDs could
compensate the drawback. TIDs and OIDs must help each
other if PostgreSQL needs the concept like ROWID.

regards,
Hiroshi Inoue

Show quoted text

-----Original Message-----
From: mlw

Hannu Krosing wrote:

mlw wrote:

I posted this question earlier, but it looks like it never made it on.

If you can control the OIDs on a per table basis, and some

tables need not even

have any, why not let each table have its own OID range?

Essentially, each

record will be numbered relative to 0 on its table?

That would really cut down the OID wrap around problem, and

allow records to

have a notion of serialization.

What would the meaning of such an "OID" be ?

Apart from duplicating the primary key that is ?

Some other databases have the notion of a ROWID which uniquely
identifies a row
within a table. OID can be used for that, but it means if you use
it, you must
limit the size of your whole database system. The other
alternative is to make
a column called "rowid" and a sequence for it and a default of
nextval('table_rowid'). That means more work for those porting.

#37mlw
markw@mohawksoft.com
In reply to: Hiroshi Inoue (#36)
Re: Re: OID wraparound: summary and proposal

Hiroshi Inoue wrote:

The analog of ROWID in PostgreSQL is TID rather than OID
because TID is a physical address of a tuple within a table.
However there's a significant difference. Unfortunately TID
is transient. It is changed by UPDATE and VACUUM.
Though TIDs are unavailable for critical use, OIDs could
compensate the drawback. TIDs and OIDs must help each
other if PostgreSQL needs the concept like ROWID.

That is true now, but I am saying that it should not be true. Rather than have
a single limited global resource, the current OID, if possible, tables should
get their own notion of an OID, like a ROWID.

The ability to eliminated OID from tables is a great step, but, if one needs a
OID behavior on tables, then one has a limit of 2B-4B rows in an entire
database system for which all tables compete.

You have even said you need the notion of an OID for some ODBC cursor stuff you
are doing. Thus eliminating OIDs is not an option for you.

The options are:
No OID on a table. This breaks any code that assumes an OID must always exist.
Use OIDs on a table. This limits the size of the database, I have already had
to drop and reload a database once because of OID depletion (3 months).

If OIDs can become the equivalent of a ROWID, then code designed that assumes
OID are always valid will still work, and Postgres will not run out of OIDs in
system wide sense.

I know I won't be doing the work to make the changes, so I am sensitive to that
issue, but as a PostgreSQL user, I can say that I have hit the OID limit once
already and will continue to hit it periodically. Getting rid of OIDs may not
be an option for me because I planning to do some replication across several
boxes, and that means I would use OID or use a sequence and "default
nextval(...)."

--
5-4-3-2-1 Thunderbirds are GO!
------------------------
http://www.mohawksoft.com

#38Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Hiroshi Inoue (#36)
Re: Re: OID wraparound: summary and proposal

mlw wrote:

Hiroshi Inoue wrote:

The analog of ROWID in PostgreSQL is TID rather than OID
because TID is a physical address of a tuple within a table.
However there's a significant difference. Unfortunately TID
is transient. It is changed by UPDATE and VACUUM.
Though TIDs are unavailable for critical use, OIDs could
compensate the drawback. TIDs and OIDs must help each
other if PostgreSQL needs the concept like ROWID.

That is true now, but I am saying that it should not be true. Rather than have
a single limited global resource, the current OID, if possible, tables should
get their own notion of an OID, like a ROWID.

I've objected optional OID but never objected OIDs per table.
OIDs per table is more important than others IMHO.

regards,
Hiroshi Inoue

#39Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Hiroshi Inoue (#38)
AW: Re: OID wraparound: summary and proposal

Some other databases have the notion of a ROWID which uniquely

identifies a row

within a table. OID can be used for that, but it means if you use it,

you must

limit the size of your whole database system.

Imho that is getting it all wrong. OID is *not* a suitable substitute
for other
db's ROWID.

If you take a few extra precautions then you can use XTID in PostgreSQL
instead of other's ROWID.

We often hear, that it is safer to use ROWID in Oracle and Informix than
in
PostgreSQL. It is only true that the risc of getting at the wrong record
is
lower. Are you going to take chances when manipulating rows ? NO !
Thus any sensible program working on ROWID's will have builtin
precautions,
like locking the table, or using additional where quals.

I am still of the opinion, that we should invent an alias ROWID at the
SQL level
for the current XTID. I do not think that it matters what datatype this
ROWID is,
an arbitrary string like xtid is sufficient, it does not need to be an
integer.

Andreas

#40mlw
markw@mohawksoft.com
In reply to: Zeugswetter Andreas SB SD (#39)
Re: AW: Re: OID wraparound: summary and proposal

I think you are focusing too much on "ROWID" and not enough on OID. The issue
at hand is OID. It is a PostgreSQL cluster wide limitation. As data storage
decreases in price, the likelihood of people running into this limitation
increases. I have run into OID problems in my curent project. Geez, 40G 7200
RPM drives are $120, amazing.

Tom has proposed being able to remove the OID from tables, to preserve this
resource. I originally thought this was a good idea, but there are tools and
utilities others may want to use in the future that require OIDs, thus they
would have to be re-written or abandoned altogether.

It seems to me, I guess and others too, that the OID mechanism should be on a
per table basis. That way OIDs are much more likely to be unique, and TRUNCATE
on a table should reset it's OID counter to zero.

Zeugswetter Andreas SB SD wrote:

Some other databases have the notion of a ROWID which uniquely

identifies a row

within a table. OID can be used for that, but it means if you use it,

you must

limit the size of your whole database system.

Imho that is getting it all wrong. OID is *not* a suitable substitute
for other
db's ROWID.

If you take a few extra precautions then you can use XTID in PostgreSQL
instead of other's ROWID.

We often hear, that it is safer to use ROWID in Oracle and Informix than
in
PostgreSQL. It is only true that the risc of getting at the wrong record
is
lower. Are you going to take chances when manipulating rows ? NO !
Thus any sensible program working on ROWID's will have builtin
precautions,
like locking the table, or using additional where quals.

I am still of the opinion, that we should invent an alias ROWID at the
SQL level
for the current XTID. I do not think that it matters what datatype this
ROWID is,
an arbitrary string like xtid is sufficient, it does not need to be an
integer.

Andreas

--
5-4-3-2-1 Thunderbirds are GO!
------------------------
http://www.mohawksoft.com

#41Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: mlw (#40)
RE: AW: Re: OID wraparound: summary and proposal

It seems to me, I guess and others too, that the OID mechanism should

be on a

per table basis. That way OIDs are much more likely to be unique, and

TRUNCATE

on a table should reset it's OID counter to zero.

Seems to me, that this would be no different than a performance improved
version
of SERIAL.
If you really need OID, you imho want the systemid tableid tupleid
combo.
A lot of people seem to use OID, when they really could use XTID. That
is
what I wanted to say.

Andreas

#42Alex Pilosov
alex@pilosoft.com
In reply to: mlw (#40)
Re: AW: Re: OID wraparound: summary and proposal

On Mon, 6 Aug 2001, mlw wrote:

I think you are focusing too much on "ROWID" and not enough on OID. The issue
at hand is OID. It is a PostgreSQL cluster wide limitation. As data storage
decreases in price, the likelihood of people running into this limitation
increases. I have run into OID problems in my curent project. Geez, 40G 7200
RPM drives are $120, amazing.

Possibly you were using OIDs for what they weren't intended ;)

Tom has proposed being able to remove the OID from tables, to preserve
this resource. I originally thought this was a good idea, but there
are tools and utilities others may want to use in the future that
require OIDs, thus they would have to be re-written or abandoned
altogether.

What are these tools?

It seems to me, I guess and others too, that the OID mechanism should be on a
per table basis. That way OIDs are much more likely to be unique, and TRUNCATE
on a table should reset it's OID counter to zero.

I disagree. OID as it is now is a mandatory SERIAL that is added to every
table. Most tables don't need such a field, those which do, well, they can
keep it as it is now (global per-database), or, if you want per-table
sequence, just create a SERIAL field explicitly.

#43mlw
markw@mohawksoft.com
In reply to: Zeugswetter Andreas SB SD (#41)
Re: AW: Re: OID wraparound: summary and proposal

Zeugswetter Andreas SB SD wrote:

It seems to me, I guess and others too, that the OID mechanism should

be on a

per table basis. That way OIDs are much more likely to be unique, and

TRUNCATE

on a table should reset it's OID counter to zero.

Seems to me, that this would be no different than a performance improved
version
of SERIAL.
If you really need OID, you imho want the systemid tableid tupleid
combo.
A lot of people seem to use OID, when they really could use XTID. That
is
what I wanted to say.

I don't care about having an OID or ROWID, I care that there is a 2^32 limit to
the current OID strategy and that a quick fix of allowing tables to exist
without OIDs may break some existing software. I was suggesting the OIDs be
managed on a "per table" basis as a better solution.

In reality, a 32 bit OID, even isolated per table, may be too small. Databases
are getting HUGE. 40G disk drives are less than $100 bucks, in a few months 80G
drives will be less than $200, one can put together 200G RAID systems for about
$1000, a terabyte for about $5000. A database that would have needed an
enterprise level system, just 7 years ago, can be run on a $500 desktop today.

--
5-4-3-2-1 Thunderbirds are GO!
------------------------
http://www.mohawksoft.com

#44Alex Pilosov
alex@pilosoft.com
In reply to: mlw (#43)
Re: Re: AW: Re: OID wraparound: summary and proposal

On Mon, 6 Aug 2001, mlw wrote:

Zeugswetter Andreas SB SD wrote:

It seems to me, I guess and others too, that the OID mechanism should

be on a

per table basis. That way OIDs are much more likely to be unique, and

TRUNCATE

on a table should reset it's OID counter to zero.

Seems to me, that this would be no different than a performance improved
version
of SERIAL.
If you really need OID, you imho want the systemid tableid tupleid
combo.
A lot of people seem to use OID, when they really could use XTID. That
is
what I wanted to say.

I don't care about having an OID or ROWID, I care that there is a 2^32 limit to
the current OID strategy and that a quick fix of allowing tables to exist
without OIDs may break some existing software. I was suggesting the OIDs be
managed on a "per table" basis as a better solution.

Again, what existing software demands per-table OID field? Isn't it what
primary keys are for?

In reality, a 32 bit OID, even isolated per table, may be too small.
Databases are getting HUGE. 40G disk drives are less than $100 bucks,
in a few months 80G drives will be less than $200, one can put
together 200G RAID systems for about $1000, a terabyte for about
$5000. A database that would have needed an enterprise level system,
just 7 years ago, can be run on a $500 desktop today.

If its too small for you, make a serial8 datatype (or something like
this), and use it for your tables. For me, I have tables which have very
few fields, and I don't want to waste 4 bytes/row (much less 8) for OID.

#45Hannu Krosing
hannu@tm.ee
In reply to: Zeugswetter Andreas SB SD (#41)
Re: Re: AW: Re: OID wraparound: summary and proposal

mlw wrote:

Zeugswetter Andreas SB SD wrote:

It seems to me, I guess and others too, that the OID mechanism should

be on a

per table basis. That way OIDs are much more likely to be unique, and

TRUNCATE

on a table should reset it's OID counter to zero.

Seems to me, that this would be no different than a performance improved
version of SERIAL.
If you really need OID, you imho want the systemid tableid tupleid
combo.

having such an global_oid fits nicely with having table-uniqe oids.

just do

select 'mysite.'||text(tableoid)||'.'||text(oid) as global_oid from
mytable;

to get it

I don't care about having an OID or ROWID, I care that there is a 2^32 limit to
the current OID strategy and that a quick fix of allowing tables to exist
without OIDs may break some existing software. I was suggesting the OIDs be
managed on a "per table" basis as a better solution.

Now that we have tableoid the need of globally unique oid is much
diminished.

In reality, a 32 bit OID, even isolated per table, may be too small. Databases
are getting HUGE. 40G disk drives are less than $100 bucks, in a few months 80G
drives will be less than $200, one can put together 200G RAID systems for about
$1000, a terabyte for about $5000. A database that would have needed an
enterprise level system, just 7 years ago, can be run on a $500 desktop today.

And my PalmPilot has more memory adn storage and processor power than
PDP-11
where UNIX was developed ;)

So the real solution will be going to 64-bit OID's and XIDS, just that
some
platforms (I'd like to know which) dont have a good "long long"
implementation yet;

------------------
Hannu

#46Hannu Krosing
hannu@tm.ee
In reply to: Zeugswetter Andreas SB SD (#41)
Re: AW: Re: OID wraparound: summary and proposal

Zeugswetter Andreas SB SD wrote:

It seems to me, I guess and others too, that the OID mechanism should

be on a

per table basis. That way OIDs are much more likely to be unique, and

TRUNCATE

on a table should reset it's OID counter to zero.

Seems to me, that this would be no different than a performance improved
version
of SERIAL.
If you really need OID, you imho want the systemid tableid tupleid
combo.

or (systemid.tableid.tupleid.versioninterval) if you want to be able to
time-travel

---------------
Hannu

#47Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Alex Pilosov (#44)
RE: Re: AW: Re: OID wraparound: summary and proposal

-----Original Message-----
From: Alex Pilosov

On Mon, 6 Aug 2001, mlw wrote:

Zeugswetter Andreas SB SD wrote:

It seems to me, I guess and others too, that the OID

mechanism should

be on a

per table basis. That way OIDs are much more likely to be

unique, and

TRUNCATE

on a table should reset it's OID counter to zero.

Seems to me, that this would be no different than a

performance improved

version
of SERIAL.
If you really need OID, you imho want the systemid tableid tupleid
combo.
A lot of people seem to use OID, when they really could use XTID. That
is
what I wanted to say.

I don't care about having an OID or ROWID, I care that there is

a 2^32 limit to

the current OID strategy and that a quick fix of allowing

tables to exist

without OIDs may break some existing software. I was suggesting

the OIDs be

managed on a "per table" basis as a better solution.

Again, what existing software demands per-table OID field? Isn't it what
primary keys are for?

I was just about to implement updatable cursors in psqlODBC using
TID and OID. I've half done it but the rest is pending now. I've had the
the plan since I introduced Tid scan in 7.0.

regards,
Hiroshi Inoue

#48Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alex Pilosov (#44)
Re: OID wraparound: summary and proposal

Hmm, this has proven more contentious than I expected ;-). It seems the
one thing that absolutely everybody agrees on is that 4-byte OIDs are no
longer workable as global identifiers.

My feeling after reading the discussions is that the best way to go
in the long run is to change from a database-wide OID generator to
per-table OID generators, and to say that if you want a database-wide
unique identifier then you should use <table oid, row oid> as that
identifier. If you want cluster-wide or universe-wide uniqueness then
you stick additional fields on the front of that. Unique IDs formed
in this way are a lot more useful than IDs taken from a simple global
sequence, because you can use the subfields to determine where to look
for the object.

If OID remains at 4 bytes then this still isn't very satisfactory for
tables that are likely to have more than 4 billion INSERTs in their
lifetime. However, rather than imposing the cost of 8-byte OIDs
everywhere, I'd be inclined to say that people who need unique
identifiers in such tables should use user-defined columns generated
from int8 sequences. (Obviously it would help if we created an
int8-based sequence type... but that's certainly doable.) Perhaps in
another few years, when all portability and performance issues with int8
are history, we could think about changing OID to 8 bytes everywhere;
but I don't think that's a good idea just yet.

I do not think it is feasible to try to implement per-table OID
generation for 7.2. What I'd like to do for 7.2 is continue with
my previous proposal of making OID generation optional on a per-table
basis (but the default is still to generate them). This seems to fit
well with an eventual migration to per-table OIDs, since it still seems
to me that some tables don't need them at all --- particularly, tables
that are using an int8 column as key because wraparound is expected.
Also, I will change pg_description as previously discussed, since this
is clearly necessary in a per-table-OID world.

Comments, objections?

regards, tom lane

#49mlw
markw@mohawksoft.com
In reply to: Hiroshi Inoue (#47)
Re: Re: AW: Re: OID wraparound: summary and proposal

Could we modify the Relation structure to hold an Oid counter? So every where
Postgres calls "newoid(void)" it gets changed to pass the relation structure it
will be associated with, i.e. newoid(Relation *). That way, every relation
could have its own counter, AND perhaps its own spinlock. Relations are shared
amongst the various processes, correct? If you pass NULL as the relation, you
get an OID out of the ShmemVariableCache->nextXid.

Am I being overly simplistic?

#50Tom Lane
tgl@sss.pgh.pa.us
In reply to: mlw (#49)
Re: Re: AW: Re: OID wraparound: summary and proposal

mlw <markw@mohawksoft.com> writes:

Am I being overly simplistic?

Yes. For one thing, Relation structs are *not* shared, nor even
persistent (the relcache will happily discard them). For another, you
haven't mentioned how we keep the counter up-to-date across system
restarts.

regards, tom lane

#51Hannu Krosing
hannu@tm.ee
In reply to: Alex Pilosov (#44)
Re: OID wraparound: summary and proposal

Tom Lane wrote:

If OID remains at 4 bytes then this still isn't very satisfactory for
tables that are likely to have more than 4 billion INSERTs in their
lifetime. However, rather than imposing the cost of 8-byte OIDs
everywhere, I'd be inclined to say that people who need unique
identifiers in such tables should use user-defined columns generated
from int8 sequences. (Obviously it would help if we created an
int8-based sequence type... but that's certainly doable.) Perhaps in
another few years, when all portability and performance issues with int8
are history, we could think about changing OID to 8 bytes everywhere;
but I don't think that's a good idea just yet.

Which are those platforms that currently lack 8-byte ints or whose
8-byte ints are limited to values below 2^31 ?

Managing huge tables on such platforms seems to be quite hard anyway .

I guess that the change of OID from 4 to 8 bytes could be carried out as
a
compile time option ?

I do not think it is feasible to try to implement per-table OID
generation for 7.2. What I'd like to do for 7.2 is continue with
my previous proposal of making OID generation optional on a per-table
basis (but the default is still to generate them). This seems to fit
well with an eventual migration to per-table OIDs, since it still seems
to me that some tables don't need them at all --- particularly, tables
that are using an int8 column as key because wraparound is expected.
Also, I will change pg_description as previously discussed, since this
is clearly necessary in a per-table-OID world.

Changing pg_description to (table_oid,row_oid) seems reasonable for
other
reasons too, like going from description to the describee. I dont think
that pg_attribute is such a heavy OID-eater, except perhaps in case
where
each transaction creates and destroys temporary tables with very high
number of columns.

-----------------
Hannu

#52Hannu Krosing
hannu@tm.ee
In reply to: Hiroshi Inoue (#47)
Re: Re: AW: Re: OID wraparound: summary and proposal

Tom Lane wrote:

mlw <markw@mohawksoft.com> writes:

Am I being overly simplistic?

Yes. For one thing, Relation structs are *not* shared, nor even
persistent (the relcache will happily discard them).

Will it be easier to make Relation shared and persistent or creating
a new shared structure that has just a counter+lock for each
relation oid ?

For another, you
haven't mentioned how we keep the counter up-to-date across system
restarts.

Perhaps write it to database at checkpoints and get the last INSERTED
record
from WAL at restart ?

Probably too simplistic as well ;)

#53Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tom Lane (#1)
Re: OID wraparound: summary and proposal

Based on the discussion so far, here is an attempt to flesh out the
details of what to do with OIDs for 7.2:

1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE.
The default behavior will be WITH OIDS.

What about having an additional Oid generator which solely serves for
supplying user tables' per row Oids? It seems relatively easy to
implement, comparing with 64-bit Oids or Oid-less tables. I assume
that the Oid wraparound problem is not so serious with user tables.
--
Tatsuo Ishii

#54Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#51)
Re: OID wraparound: summary and proposal

Hannu Krosing <hannu@tm.ee> writes:

I guess that the change of OID from 4 to 8 bytes could be carried out
as a compile time option ?

Not unless you like the notion that the wire protocol depends on a
compile time option.

regards, tom lane

#55Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#52)
Re: Re: AW: Re: OID wraparound: summary and proposal

Hannu Krosing <hannu@tm.ee> writes:

Will it be easier to make Relation shared and persistent or creating
a new shared structure that has just a counter+lock for each
relation oid ?

The latter. Relation (by which I mean a whole relcache entry with all
its subsidiary structure, not only struct RelationData) is too large,
complex and heavyweight a structure to be a good candidate for moving
into shared memory. It also contains a lot of backend-local status
data in its current incarnation.

Some kind of shared cache for sequence generators (essentially,
generalizing the existing shared OID counter into N counters) is
probably the answer. But it would have to be a cache, not the whole
truth, so there'd need to be an underlying table that holds counters not
currently swapped into cache. That part we don't have a good model for
in the existing OID-generator code, nor in the existing sequence code.

regards, tom lane

#56Fernando Nasser
fnasser@redhat.com
In reply to: Alex Pilosov (#44)
Re: OID wraparound: summary and proposal

Tom,

If we have WITH NOOID, why not having a WITH OID32 and WITH OID64 (or
something of a sort)
as well (being OID32 the default and OID an alias to it)?
The last would not be available on some systems
(who will use a system that does not support long long as a database
server anyway?)

The wire protocol will always handle the (tableoid) long form,
references will always store
the long form... The OID32 would exist only to allow people to save
space in tables that need
OIDs but not the 64 bit version.

--
Fernando Nasser
Red Hat Canada Ltd. E-Mail: fnasser@redhat.com
2323 Yonge Street, Suite #300
Toronto, Ontario M4P 2C9

#57Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tatsuo Ishii (#53)
Re: OID wraparound: summary and proposal

Based on the discussion so far, here is an attempt to flesh out the
details of what to do with OIDs for 7.2:

1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE.
The default behavior will be WITH OIDS.

What about having an additional Oid generator which solely serves for
supplying user tables' per row Oids? It seems relatively easy to
implement, comparing with 64-bit Oids or Oid-less tables. I assume
that the Oid wraparound problem is not so serious with user tables.

This is a very interesting idea. Have two oid counters, one for system
tables and another for user tables. It isolates problems with oid
wraparound caused by large user tables.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#58Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fernando Nasser (#56)
Re: OID wraparound: summary and proposal

Fernando Nasser <fnasser@redhat.com> writes:

The wire protocol will always handle the (tableoid) long form,

I think you are handwaving away what is precisely the most painful
aspect. To allow 64-bit type OIDs in the wire protocol, we must
(a) have a protocol version jump, and (b) force all servers and all
client libraries to be 64-bit-capable. While I'm prepared to think
that "int8 is really only 32 bits wide" is tolerable within a single
server installation, I really don't want to deal with such headaches
between clients and servers. Can you imagine how hard it will be
to track down a bug that arises because one old client is dropping
the high-order bits of type OIDs? Only installations that had been
up for years would ever see a problem; how likely is it that anyone
would even remember that some of their clients were not 64-bit-ready?

When we're ready to make that jump, I think we should just move to
64 bit OIDs, full stop, no exceptions, no turning back, no "configure
time option", no backwards compatibility with old clients. Anything
else is a time bomb. I'd even be inclined to start running the OID
counter at 4-billion-plus-1, to help flush out anyplace that drops the
high half.

regards, tom lane

#59Hannu Krosing
hannu@tm.ee
In reply to: Alex Pilosov (#44)
Re: OID wraparound: summary and proposal

Tom Lane wrote:

Hannu Krosing <hannu@tm.ee> writes:

I guess that the change of OID from 4 to 8 bytes could be carried out
as a compile time option ?

Not unless you like the notion that the wire protocol depends on a
compile time option.

That could be a separate option, perhaps even a runtime one.

And yet another flag for determining weather to raise an error on
wire-oid
overflow or just to masquerade it as rollower ;)

--------------
Hannu

#60Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#57)
Re: OID wraparound: summary and proposal

Bruce Momjian <pgman@candle.pha.pa.us> writes:

This is a very interesting idea. Have two oid counters, one for system
tables and another for user tables. It isolates problems with oid
wraparound caused by large user tables.

Well, it'd keep user-space wraparound from affecting the system tables,
but given that the system tables have adequate defenses already (ie,
unique indexes) I'm not sure that there's any point. It'd not improve
the picture for user-table OID uniqueness by any measurable degree.

regards, tom lane

#61Fernando Nasser
fnasser@cygnus.com
In reply to: Alex Pilosov (#44)
Re: OID wraparound: summary and proposal

Tom Lane wrote:

Fernando Nasser <fnasser@redhat.com> writes:

The wire protocol will always handle the (tableoid) long form,

I think you are handwaving away what is precisely the most painful
aspect. To allow 64-bit type OIDs in the wire protocol, we must
(a) have a protocol version jump, and (b) force all servers and all
client libraries to be 64-bit-capable. While I'm prepared to think
that "int8 is really only 32 bits wide" is tolerable within a single
server installation, I really don't want to deal with such headaches
between clients and servers. Can you imagine how hard it will be
to track down a bug that arises because one old client is dropping
the high-order bits of type OIDs? Only installations that had been
up for years would ever see a problem; how likely is it that anyone
would even remember that some of their clients were not 64-bit-ready?

A protocol bump is inevitable if we ever want to deal with 64 bit OIDs,
so the sooner we do it the better.

Someone pointed out that even with optional OIDs and per table OIDs,
we would still need to allow per table OIDs to be more than 32 bits
(I am taking his word for it). If that is the case, the scenario you
described above is inevitable.

When we're ready to make that jump, I think we should just move to
64 bit OIDs, full stop, no exceptions, no turning back, no "configure
time option", no backwards compatibility with old clients. Anything
else is a time bomb. I'd even be inclined to start running the OID
counter at 4-billion-plus-1, to help flush out anyplace that drops the
high half.

That would be the way to go. We are just trying to buy some time with
the other measures.

But some folks are complaining of having to use 64 bit OIDs when they
don't really need them, so that is why I proposed the OID32/OID64 option.

--
Fernando Nasser
Red Hat - Toronto E-Mail: fnasser@redhat.com
2323 Yonge Street, Suite #300
Toronto, Ontario M4P 2C9

#62Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tom Lane (#60)
Re: OID wraparound: summary and proposal

Well, it'd keep user-space wraparound from affecting the system tables,
but given that the system tables have adequate defenses already (ie,
unique indexes) I'm not sure that there's any point. It'd not improve
the picture for user-table OID uniqueness by any measurable degree.

But from the point of users' view, it does not prevent "create XXX
comand fails due to Oid wraparounding" problems, no?

Also I am worried about the performance of the per table Oid
generators. Even the system tables going to have that kind of
generators? What would happend if there are 5k tables in a database?
It's not very rare situation in a large installation.
--
Tatsuo Ishii

#63Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tatsuo Ishii (#62)
Re: OID wraparound: summary and proposal

Tatsuo Ishii <t-ishii@sra.co.jp> writes:

Also I am worried about the performance of the per table Oid
generators.

I think performance would be a big problem if we tried to implement them
just like sequences are done now. But a shared hashtable of Oid
generators (each one handled roughly like the single Oid generator
currently is) would probably work okay. We'd have to work out how to
have a backing disk table for this hashtable, since we couldn't expect
to have room in shared memory for all generators at all times --- but we
could cache all the active generators in shared memory, I'd think.

regards, tom lane

#64mlw
markw@mohawksoft.com
In reply to: Bruce Momjian (#57)
Re: OID wraparound: summary and proposal

Tom Lane wrote:

Tatsuo Ishii <t-ishii@sra.co.jp> writes:

Also I am worried about the performance of the per table Oid
generators.

I think performance would be a big problem if we tried to implement them
just like sequences are done now. But a shared hashtable of Oid
generators (each one handled roughly like the single Oid generator
currently is) would probably work okay. We'd have to work out how to
have a backing disk table for this hashtable, since we couldn't expect
to have room in shared memory for all generators at all times --- but we
could cache all the active generators in shared memory, I'd think.

Maybe I'm confused, is there no shared memory for each unique table in use?
If so, couldn't the Oid generator be stored there? if not, how does that
work?

Second, IMHO I think you are a bit too conservative with shared memory. If
one has so many active tables that their Oid generators wouldn't fit in
shared memory, this would indicate a fairly large database, I think one
could be justified in requiring more resources than the minimum. PostgreSQL
is already increasing in resource requirements. The introduction of WAL
added a lot of disk space for operation. A few K of shared RAM doesn't seem
like a lot. (Maybe I am jaded as I have bumped my shared memory to 128M)

Lastly, were PostgreSQL to have multiple Oid generators, each of these
could have its own spinlock or mutex, thus reducing competition. In an
active system with activity on multiple tables, this could improve
performance.

#65Jan Wieck
JanWieck@Yahoo.com
In reply to: Tom Lane (#63)
Re: OID wraparound: summary and proposal

Tom Lane wrote:

Tatsuo Ishii <t-ishii@sra.co.jp> writes:

Also I am worried about the performance of the per table Oid
generators.

I think performance would be a big problem if we tried to implement them
just like sequences are done now. But a shared hashtable of Oid
generators (each one handled roughly like the single Oid generator
currently is) would probably work okay. We'd have to work out how to
have a backing disk table for this hashtable, since we couldn't expect
to have room in shared memory for all generators at all times --- but we
could cache all the active generators in shared memory, I'd think.

Keep also in mind that actually the uniqueness of Oid's
across all tables is used by TOAST to determine that a
toasted value found in the new tuple is the same than in the
old one on heap_update() or not. If we go for a separate Oid
per table, an UPDATE with a subselect from another table
could get misinterpreted in the toaster, not duplicating the
value but referencing the external value in another tables
toast-shadow table.

It's no big deal, some additional checks of the va_toastrelid
beeing the same as the target relations toast relation should
do it.

Now since toast needs the row Oid allways, I think the idea
of making Oid's in user tables optional is dead.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

#66Hannu Krosing
hannu@tm.ee
In reply to: Alex Pilosov (#44)
Re: OID wraparound: summary and proposal

Tom Lane wrote:

Fernando Nasser <fnasser@redhat.com> writes:

The wire protocol will always handle the (tableoid) long form,

I think you are handwaving away what is precisely the most painful
aspect. To allow 64-bit type OIDs in the wire protocol, we must
(a) have a protocol version jump, and (b) force all servers and all
client libraries to be 64-bit-capable. While I'm prepared to think
that "int8 is really only 32 bits wide" is tolerable within a single
server installation, I really don't want to deal with such headaches
between clients and servers. Can you imagine how hard it will be
to track down a bug that arises because one old client is dropping
the high-order bits of type OIDs?

When I thought of it, my solution was to issue a NOTICE on each and
very OID truncation - they should be visible enough to force upgrade ;)

Show quoted text

Only installations that had been
up for years would ever see a problem; how likely is it that anyone
would even remember that some of their clients were not 64-bit-ready?

When we're ready to make that jump, I think we should just move to
64 bit OIDs, full stop, no exceptions, no turning back, no "configure
time option", no backwards compatibility with old clients. Anything
else is a time bomb. I'd even be inclined to start running the OID
counter at 4-billion-plus-1, to help flush out anyplace that drops the
high half.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

#67Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Wieck (#65)
Re: OID wraparound: summary and proposal

Jan Wieck <JanWieck@yahoo.com> writes:

Keep also in mind that actually the uniqueness of Oid's
across all tables is used by TOAST to determine that a
toasted value found in the new tuple is the same than in the
old one on heap_update() or not.
It's no big deal, some additional checks of the va_toastrelid
beeing the same as the target relations toast relation should
do it.

Good point.

Now since toast needs the row Oid allways, I think the idea
of making Oid's in user tables optional is dead.

Why? I see where it's looking at the main-row OID and attno to decide
if it's the same value or not, but this seems strange and wrong. Why
doesn't it just compare va_toastrelid and va_valueid? Considering that
the main point of this comparison is to distinguish values associated
with different versions of the same row, neither main row OID nor
attribute number seem helpful. I don't see why we expend space on
storing va_rowid + va_attno at all.

BTW, I've already completed implementing optional OIDs, so I'm not
going to give up the idea lightly at this point ;-)

regards, tom lane

#68mlw
markw@mohawksoft.com
In reply to: Zeugswetter Andreas SB SD (#39)
Re: AW: Re: OID wraparound: summary and proposal

Somehow I guess I created a misunderstanding. I don't really care about
ROWID. I care that OID is a 32 bit number. The notion that each table could
have its own "OID" similar to a ROWID could be an intermediate solution. I
have flip-flopped a couple times about whether or not the OID being able to
be eliminated from some tables is a good idea. Some code depends on the
OID.

I have hit OID problems personally. To be honest I think it can be a huge
problem. As I have said, 40G disks are under $100. Just a few years ago a
40G storage system would have cost $20K-$30K. BIG databases are being
created today, which wouldn't have been funded just a few years ago. At my
company we have an aggregated database of 3 distinctly large databases, and
hit a bug in large OID numbers in 7.0.3.

The way I see it there are 4 options for the OID:
(1) Keep OID handling as it is. I think everyone agrees that this is not an
option.
(2) Allow the ability to have tables without OIDs. This is a source of
debate.
(3) Allow tables to have their own notion of an OID. This is harder to do,
and also a source of debate.
(4) Make OIDs 64 or 128 bit. (there are platform issues.)

Show quoted text

Some other databases have the notion of a ROWID which uniquely

identifies a row

within a table. OID can be used for that, but it means if you use it,

you must

limit the size of your whole database system.

Imho that is getting it all wrong. OID is *not* a suitable substitute
for other
db's ROWID.

If you take a few extra precautions then you can use XTID in PostgreSQL
instead of other's ROWID.

We often hear, that it is safer to use ROWID in Oracle and Informix than
in
PostgreSQL. It is only true that the risc of getting at the wrong record
is
lower. Are you going to take chances when manipulating rows ? NO !
Thus any sensible program working on ROWID's will have builtin
precautions,
like locking the table, or using additional where quals.

I am still of the opinion, that we should invent an alias ROWID at the
SQL level
for the current XTID. I do not think that it matters what datatype this
ROWID is,
an arbitrary string like xtid is sufficient, it does not need to be an
integer.

Andreas

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl

#69Matthew T. O'Connor
matthew@zeut.net
In reply to: Zeugswetter Andreas SB SD (#39)
Help with Vacuum Failure

Hello, I'm having a problem vacuum a table and I didn't see an answer using
the fts engine.

I have two questions:

1) Is this a big problem, can it be fixed, do I have to dump / restore this
table?
2) I found this problem from my nightly cron driven vacuum -a -z. When it
hits this error the entire vacuumdb process stops immediately thus skipping
any remaining databases. Should it do this? Or should it continue on and
vacuum the other databases?

Here is the error:

cms_beau=# vacuum hits; (It works without the analyze phase of backup.)
VACUUM
cms_beau=# VACUUM verbose analyze hits;
NOTICE: --Relation hits--
NOTICE: Pages 8389: Changed 0, reaped 2, Empty 0, New 0; Tup 834575: Vac 0,
Keep/VTL 4/4, Crash 0, UnUsed 6, MinLen 52, MaxLen 121; Re-using:
Free/Avail. Space 376/64; EndEmpty/Avail. Pages 0/1. CPU 0.34s/0.05u sec.
NOTICE: Index hits_id_key: Pages 1831; Tuples 834575: Deleted 0. CPU
0.11s/0.56u sec.
NOTICE: Rel hits: Pages: 8389 --> 8389; Tuple(s) moved: 0. CPU 0.00s/0.00u
sec.
NOTICE: --Relation pg_toast_6742393--
NOTICE: Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0,
Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail.
Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE: Index pg_toast_6742393_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
NOTICE: Analyzing...
ERROR: MemoryContextAlloc: invalid request size 4294079565
cms_beau=#

Additional information:

sort_mem = 16384
shared_buffers = 8192

cms_beau=# select version();
version
-------------------------------------------------------------
PostgreSQL 7.1.2 on i686-pc-linux-gnu, compiled by GCC 2.96
(1 row)

cms_beau=# \d hits
                                      Table "hits"
  Attribute  |           Type           |                   Modifier
-------------+--------------------------+-----------------------------------
------------
 id          | integer                  | not null default
nextval('hits_id_seq'::text)
 operator_id | integer                  |
 connected   | timestamp with time zone | default 'now'
 page        | text                     |
Index: hits_id_key

cms_beau=# select count(*) from hits;
count
--------
834539
(1 row)

Please let me know if there is any other information you need.

Thank you much,

Matt O'Connor

#70Neil Padgett
npadgett@redhat.com
In reply to: Zeugswetter Andreas SB SD (#39)
Re: Re: AW: Re: OID wraparound: summary and proposal

mlw wrote:

The way I see it there are 4 options for the OID:

[snip]

(2) Allow the ability to have tables without OIDs. This is a source of
debate.

I think Tom Lane has already committed some patches to allow for this.
So, I think you should be able to try this from the latest CVS. (Tom?)

Neil

--
Neil Padgett
Red Hat Canada Ltd. E-Mail: npadgett@redhat.com
2323 Yonge Street, Suite #300,
Toronto, ON M4P 2C9

#71Doug McNaught
doug@wireboard.com
In reply to: Zeugswetter Andreas SB SD (#39)
Re: Re: AW: Re: OID wraparound: summary and proposal

mlw <markw@mohawksoft.com> writes:

Somehow I guess I created a misunderstanding. I don't really care about
ROWID. I care that OID is a 32 bit number. The notion that each table could
have its own "OID" similar to a ROWID could be an intermediate solution. I
have flip-flopped a couple times about whether or not the OID being able to
be eliminated from some tables is a good idea. Some code depends on the
OID.

See below...

The way I see it there are 4 options for the OID:

(2) Allow the ability to have tables without OIDs. This is a source of
debate.

If we do this, and default OIDs to "on", honestly, where's the
problem? If the DBA does nothing, things work as before (with
potential OID wraparound issues). If you want to avoid/minimize the
issues, turn off OIDs on your large tables, and write/fix your code to
cope.

(3) Allow tables to have their own notion of an OID. This is harder to do,
and also a source of debate.
(4) Make OIDs 64 or 128 bit. (there are platform issues.)

(5) [this was suggested earlier] Create separate spaces for "system"
and "user" OIDs. This requires a similar mechanism to (3), but may be
somewhat easier.

-Doug
--
Free Dmitry Sklyarov!
http://www.freesklyarov.org/

We will return to our regularly scheduled signature shortly.

#72Tom Lane
tgl@sss.pgh.pa.us
In reply to: Neil Padgett (#70)
Re: Re: AW: Re: OID wraparound: summary and proposal

Neil Padgett <npadgett@redhat.com> writes:

mlw wrote:

The way I see it there are 4 options for the OID:

[snip]

(2) Allow the ability to have tables without OIDs. This is a source of
debate.

I think Tom Lane has already committed some patches to allow for this.
So, I think you should be able to try this from the latest CVS. (Tom?)

Yes, it's done and in CVS. I think this is orthogonal to the other
proposals: whatever we want to do with OID, it's a useful feature to
be able to suppress them for tables that you're sure don't need one.

I thought the discussion had more or less concluded that separate-OID-
generator-per-table was the next step to take. That won't get done in
time for 7.2, though.

regards, tom lane

#73Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthew T. O'Connor (#69)
Re: Help with Vacuum Failure

"Matthew T. O'Connor" <matthew@zeut.net> writes:

cms_beau=# vacuum hits; (It works without the analyze phase of backup.)
VACUUM
cms_beau=# VACUUM verbose analyze hits;
NOTICE: --Relation hits--
NOTICE: Pages 8389: Changed 0, reaped 2, Empty 0, New 0; Tup 834575: Vac 0,
Keep/VTL 4/4, Crash 0, UnUsed 6, MinLen 52, MaxLen 121; Re-using:
Free/Avail. Space 376/64; EndEmpty/Avail. Pages 0/1. CPU 0.34s/0.05u sec.
NOTICE: Index hits_id_key: Pages 1831; Tuples 834575: Deleted 0. CPU
0.11s/0.56u sec.
NOTICE: Rel hits: Pages: 8389 --> 8389; Tuple(s) moved: 0. CPU 0.00s/0.00u
sec.
NOTICE: --Relation pg_toast_6742393--
NOTICE: Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0,
Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail.
Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE: Index pg_toast_6742393_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
NOTICE: Analyzing...
ERROR: MemoryContextAlloc: invalid request size 4294079565
cms_beau=#

This looks like you have corrupted data in your table --- specifically,
a variable-length value with a bogus length word. If so, you'll get a
similar error during any attempt to access the particular value or row
that's corrupted. A quick check of this theory is to try to pg_dump
the table --- if it fails with the same sort of error, then you have
a problem.

PostgreSQL 7.1.2 on i686-pc-linux-gnu, compiled by GCC 2.96

2.96? AFAICT 2.95.3 is the latest official release of GCC.

regards, tom lane

#74Dale Johnson
dale@zembu.com
In reply to: Doug McNaught (#71)
Re: AW: Re: OID wraparound: summary and proposal

The way I see it there are 4 options for the OID:

What about a vacuum analyze for the database that renumbers theOIDs
back at some baseline? There is still a limitation on the total number
of active rows in the database (0.5 * 2^32), but at least we wouldn't
have this timebomb.

Dale Johnson