newoid in invapi.c

Started by Maurice Gittens, almost 28 years ago, 8 messages

Maurice Gittens

mgittens@gits.nl

almost 28 years ago

Hi,

In the file large_object/inv_api.c there is a statement in the function
inv_create which goes:

file_oid=newoid() + 1;

Later on, a heap_create_with_catalog call is performed to create a heap
for the large object, called xinv<file_oid>.

According to the code (and the comments in the code), the assumption is that the
oid of the heap relation will be equal to the value of the variable file_oid.

This of course will only be the case if nobody else called newoid()
before the heap relation is created.

This might lead the large object implementation to confuse
large object relations with other relations.

I think this is a bug. Am I right?
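
A minimal sketch of the race described above, not the actual backend code: a hypothetical global counter `next_oid` stands in for the oid generator, and `guess_matches` (an illustrative name) imitates the guess-then-create pattern with a configurable number of intervening newoid() calls.

```c
#include <assert.h>

/* Hypothetical stand-in for the backend's oid generator: one shared counter. */
static unsigned int next_oid = 1000;

static unsigned int newoid(void)
{
    return next_oid++;
}

/*
 * Imitates the pattern in inv_create: guess the oid of the relation that is
 * about to be created, then create it. `interleaved_calls` models other
 * callers taking oids between the guess and the creation.
 * Returns 1 if the guess matched the oid actually assigned, 0 otherwise.
 */
static int guess_matches(unsigned int interleaved_calls)
{
    unsigned int file_oid = newoid() + 1;   /* the guess */
    unsigned int i;

    for (i = 0; i < interleaved_calls; i++)
        (void) newoid();                    /* somebody else allocates an oid */

    /* creating the heap relation consumes the next oid */
    return newoid() == file_oid;
}
```

With no intervening calls the guess holds (guess_matches(0) returns 1); a single intervening call is enough to break it (guess_matches(1) returns 0).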

Thanks,
Maurice

Peter T Mount

psqlhack@maidast.demon.co.uk

almost 28 years ago

In reply to: Maurice Gittens (#1)

Re: [HACKERS] newoid in invapi.c

On Fri, 6 Mar 1998, Maurice Gittens wrote:

Hi,

In the file large_object/inv_api.c there is a statement in the function
inv_create which goes:

file_oid=newoid() + 1;

Later on, a heap_create_with_catalog call is performed to create a heap
for the large object, called xinv<file_oid>.

According to the code (and the comments in the code), the assumption is that the
oid of the heap relation will be equal to the value of the variable file_oid.

This of course will only be the case if nobody else called newoid()
before the heap relation is created.

This might lead the large object implementation to confuse
large object relations with other relations.

I think this is a bug. Am I right?

Yes, and no. LargeObjects are supposed to run within a transaction (if you
don't then some fun things happen), and (someone correct me if I'm wrong)
if newoid() is called from within the transaction, it is safe?

--
Peter T Mount petermount@earthling.net or pmount@maidast.demon.co.uk
Main Homepage: http://www.demon.co.uk/finder
Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk

Maurice Gittens

mgittens@gits.nl

almost 28 years ago

In reply to: Peter T Mount (#2)

Re: [HACKERS] newoid in invapi.c

-----Original Message-----
From: Peter T Mount <psqlhack@maidast.demon.co.uk>
To: Maurice Gittens <mgittens@gits.nl>
Cc: PostgreSQL-development <hackers@postgreSQL.org>
Date: Monday 9 March 1998 3:34
Subject: Re: [HACKERS] newoid in invapi.c

This might lead the large object implementation to confuse
large object relations with other relations.

I think this is a bug. Am I right?

Yes, and no. LargeObjects are supposed to run within a transaction (if you
don't then some fun things happen), and (someone correct me if I'm wrong)
if newoid() is called from within the transaction, it is safe?

I see no evidence in the code that suggests that it is safe in transactions.
The GetNewObjectIdBlock() function, which generates the OID blocks, _does_
acquire a spinlock before it generates a new block of oids, so usually all
will be well.
But sometimes (with a chance of <usercount>/32 when there are <usercount> active
users for the same db) the new oid might have a quite different value than
fileoid+1.

Again I see no evidence in the code that it is safe in transactions. I only
see evidence that it will _usually_ work.
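
A rough sketch of why block-wise allocation only _usually_ yields consecutive oids. This is a simulation, not the backend code: the block size of 32 is taken from the "/32" above, and `struct backend`, `shared_next_block` and `backend_newoid` are hypothetical names. The spinlock that would guard the shared state is omitted because the sketch is single-threaded.

```c
#include <assert.h>

#define OID_BLOCK_SIZE 32   /* assumed block size, from the "/32" above */

/* Shared state: start of the next unallocated block of oids. */
static unsigned int shared_next_block = 0;

/* Hypothetical per-connection state: the block it currently hands oids out of. */
struct backend {
    unsigned int next;       /* next oid to hand out */
    unsigned int remaining;  /* oids left in the current block */
};

static unsigned int backend_newoid(struct backend *b)
{
    if (b->remaining == 0) {
        /* block exhausted: take a fresh block from the shared counter */
        b->next = shared_next_block;
        shared_next_block += OID_BLOCK_SIZE;
        b->remaining = OID_BLOCK_SIZE;
    }
    b->remaining--;
    return b->next++;
}
```

Starting from this initial state, if one connection draws oid 0 and a second connection then draws its first oid (32), the first connection's oid after 31 is 64, not 32. Two successive calls from the same connection differ by more than one whenever a block boundary falls between them and another connection has taken a block in the meantime.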

Actually, I wonder how it could be efficiently made safe within transactions,
given that the oids generated are guaranteed to be unique within an
_entire_ postgres installation. This would seem to imply that, effectively,
only one transaction would be possible at the same time in an entire
postgresql database.

My current strategy to solve this problem involves the use of a new
system catalog which I call pg_large_object. This catalog contains
information about each large object in the system.
Currently the information maintained is:
- identification of heap and index relations used by the large_object
- the size of the large object
- information about the type of the large object.
I still need to figure out how to create a new _unique_ index on a system
catalog using information in the indexing.h file.

Given an oid, this table allows us to determine if it is a valid large object.
I think this is necessary (to be able to maintain referential integrity) if
we're ever going to have a large object type.
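
To make the proposed catalog a little more concrete, here is a sketch of what one row and a validity check could look like. The field names, the `lo_oid` key and the `lo_lookup` helper are all assumptions for illustration; the linear scan merely stands in for a lookup through the unique index mentioned above.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;

/* Hypothetical row layout for the proposed pg_large_object catalog. */
struct lo_entry {
    Oid  lo_oid;       /* identity of the large object (assumed key) */
    Oid  heap_relid;   /* heap relation used by the large object */
    Oid  index_relid;  /* index relation used by the large object */
    long size;         /* size of the large object */
    Oid  lo_type;      /* type of the large object */
};

/* Returns the catalog row for `oid`, or NULL if it is not a large object. */
static const struct lo_entry *
lo_lookup(const struct lo_entry *catalog, size_t n, Oid oid)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (catalog[i].lo_oid == oid)
            return &catalog[i];
    return NULL;
}
```

A caller would treat a NULL result as "this oid is not a valid large object", which is the check needed for referential integrity.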

Similarly I have defined a table pg_tuple which allows one to
determine if a given oid is a valid tuple.
This together with some other minor changes allows some cool
object oriented features for postgresql.

Fancy the idea of persistent Java objects which live in postgresql databases?

Anyway if it all works as expected I'll submit some patches.

Thanks,
Maurice

Zeugswetter Andreas

andreas.zeugswetter@telecom.at

almost 28 years ago

In reply to: Maurice Gittens (#3)

AW: [HACKERS] newoid in invapi.c

This might lead the large object implementation to confuse
large object relations with other relations.

I think this is a bug. Am I right?

Yes, and no. LargeObjects are supposed to run within a transaction (if you
don't then some fun things happen), and (someone correct me if I'm wrong)
if newoid() is called from within the transaction, it is safe?

Again I see no evidence in the code that it is safe in transactions. I only
see evidence that it will _usually_ work.

Yes, but currently it is very hard to produce this behavior, since we still only have table locks.
You would need more than 32 lob tables, accessed concurrently (not sure about that)?
This area has to be watched especially closely when page or row locks get implemented.

I think this is why a lot of us (hands up) want to reduce the use of oids in user tables:
user tables would only have oids iff the table is created with 'with oid'.
By default, normal user tables would not have oids. I strongly support this as a strategy.

My current strategy to solve this problem involves the use of a new
system catalog which I call pg_large_object. This catalog contains
information about each large object in the system.

Hmmm ... another bottleneck? One table, many users?

Currently the information maintained is:
- identification of heap and index relations used by the large_object
- the size of the large object
- information about the type of the large object.
I still need to figure out how to create a new _unique_ index on a system
catalog using information in the indexing.h file.

I would propose a strategy where the large object is referenced by a physical position (ctid)
and is stored in one file per lob column. You always have to remember that filesystems
only behave well if they have fewer than xx members per directory, xx usually being between 1000 and 25000.
More members per directory will give file stat times of 20 ms and more, not to mention
the many open files. While it is hard to have 20000+ tables, it is easy to have millions of rows,
definitely too many for one directory file (this is not OS specific).
I would also suggest to hard link large objects to an owning row, meaning that if the row is deleted,
the lob is also deleted. I would not make this a trigger issue at the user or type programmer level,
but handle it generically in the backend. Writing a lob type is hard enough without making it
even more complex.

Given an oid, this table allows us to determine if it is a valid large object.
I think this is necessary (to be able to maintain referential integrity) if
we're ever going to have a large object type.

Similarly I have defined a table pg_tuple which allows one to
determine if a given oid is a valid tuple.

Please remember that in the long run not all user tuples can have oids. This would always be
a major performance problem.

This together with some other minor changes allows some cool
object oriented features for postgresql.

Yes, definitely. I don't know how to resolve my inner conflict on the two seemingly contrary issues,
performance versus OO features.

Fancy the idea of persistent Java objects which live in postgresql databases?

Anyway if it all works as expected I'll submit some patches.

Thanks,
Maurice

Andreas

Maurice Gittens

mgittens@gits.nl

almost 28 years ago

In reply to: Zeugswetter Andreas (#4)

Re: [HACKERS] newoid in invapi.c

-----Original Message-----
From: Zeugswetter Andreas <andreas.zeugswetter@telecom.at>
To: 'Maurice Gittens' <mgittens@gits.nl>; 'pgsql-hackers@hub.org' <pgsql-hackers@hub.org>
Date: Monday 9 March 1998 18:58
Subject: AW: [HACKERS] newoid in invapi.c

This might lead the large object implementation to confuse
large object relations with other relations.

I think this is a bug. Am I right?

Yes, and no. LargeObjects are supposed to run within a transaction (if you
don't then some fun things happen), and (someone correct me if I'm wrong)
if newoid() is called from within the transaction, it is safe?

I see no evidence in the code that suggests that it is safe in transactions.
The GetNewObjectIdBlock() function, which generates the OID blocks, _does_
acquire a spinlock before it generates a new block of oids, so usually all
will be well.
But sometimes (with a chance of <usercount>/32 when there are <usercount> active
users for the same db) the new oid might have a quite different value than
fileoid+1.

Again I see no evidence in the code that it is safe in transactions. I only
see evidence that it will _usually_ work.

Yes, but currently it is very hard to produce this behavior, since we still only have table locks.

I think this may not be true, since I said <usercount> in the above and I
should have said <connection_count>. Multiple (persistent) connections from http daemons
etc. seem likely to be common.

You would need more than 32 lob tables, accessed concurrently (not sure about that)?

I don't expect this to be true.

This area has to be watched especially closely when page or row locks get implemented.

Actually, I wonder how it could be efficiently made safe within transactions,
given that the oids generated are guaranteed to be unique within an
_entire_ postgres installation. This would seem to imply that, effectively,
only one transaction would be possible at the same time in an entire
postgresql database.

I think this is why a lot of us (hands up) want to reduce the use of oids in user tables:
user tables would only have oids iff the table is created with 'with oid'.
By default, normal user tables would not have oids. I strongly support this as a strategy.

In my view it is impossible to support general semantics
for object orientation without support for the notion of identity.
So maybe we'll get rid of the oid and introduce some other "thing" which
also gives us identity for instances of classes.
But IMO no identity equates to no OO. And currently identity is only
provided by oids, so I would vote to keep them in user tables too.

I expect it wouldn't be trivial to optionally remove oids (this would be
almost a rewrite of the system as far as I have seen).

One thing we may be able to ensure is that the oids be unique in a single
database instead of in a complete postgresql installation.
I think this implies that there would be no more sharing of system
catalogs between databases. So pg_variable (or whatever it's called)
would be a local database and would pull in all the other global
databases as well. It wouldn't be easily possible to share databases
between users any more.
This would be similar to using different postgresql installations for
different databases.

I must admit that (apart from possible performance issues) I really
like the oids.

My current strategy to solve this problem involves the use of a new
system catalog which I call pg_large_object. This catalog contains
information about each large object in the system.

Hmmm ... another bottleneck? One table, many users?

In general I believe this to be true. But how much will it cost in practice?
(And we can still be clever, can't we?)

Currently the information maintained is:
- identification of heap and index relations used by the large_object
- the size of the large object
- information about the type of the large object.
I still need to figure out how to create a new _unique_ index on a system
catalog using information in the indexing.h file.

I would propose a strategy where the large object is referenced by a physical position (ctid)
and is stored in one file per lob column. You always have to remember that filesystems
only behave well if they have fewer than xx members per directory, xx usually being between 1000 and 25000.
More members per directory will give file stat times of 20 ms and more, not to mention
the many open files. While it is hard to have 20000+ tables, it is easy to have millions of rows,
definitely too many for one directory file (this is not OS specific).
I would also suggest to hard link large objects to an owning row, meaning that if the row is deleted,
the lob is also deleted. I would not make this a trigger issue at the user or type programmer level,
but handle it generically in the backend. Writing a lob type is hard enough without making it
even more complex.

I agree with you where you say that a new file for each large object is not
"the right thing". My experience using the large objects also confirms what
you are saying. I believe that large objects are broken by design and
that some rethinking is needed to do it the right way.

Properly done, I expect this all to be integrated with some type of PL
language anyway. There has been talk about this and I hope this new language will
have OO features.

Given an oid, this table allows us to determine if it is a valid large object.
I think this is necessary (to be able to maintain referential integrity) if
we're ever going to have a large object type.

Similarly I have defined a table pg_tuple which allows one to
determine if a given oid is a valid tuple.

Please remember that in the long run not all user tuples can have oids. This would always be
a major performance problem.

I do not think that this is true, because an oid is a so-called "system
attribute" and all heap tuples in postgresql carry these attributes whether the user knows
it or not.

On the other hand, consider the benefits. IMO this would be the foundation of
a persistent (possibly OO) language which could be integrated in postgresql.
To do this we need object identity.

This together with some other minor changes allows some cool
object oriented features for postgresql.

Yes, definitely. I don't know how to resolve my inner conflict on the two seemingly contrary issues,
performance versus OO features.

Good OO designs are very clean. Consider the implementation of postgresql.
Don't you think it is well designed? I see the design and implementation of
postgresql as evidence that OO _can_ be implemented using C as well.
I expect C++ would have allowed for an even cleaner implementation, not
least because of those virtual functions.

I like postgresql for its OO features and I hope to be able to enhance it
in this area. Right now there are a number of fundamental OO operations that are
not implemented fully.

For instance:
1. Mapping from an oid to its most derived class.
2. It seems that triggers are not inherited by subclasses. Inheriting them would allow
for polymorphism in postgresql. I expect that this feature by itself
(in combination with (1)) would make the use of inheritance more abundant.
3. Allowing for abstract classes (so that I can define relations which
need not exist as files on disk but are intended as base classes for
inheritance).
For instance, in the current implementation all heap tuples have the same
set of system attributes. This could be made obvious to the OO literate by
introducing an abstract class like pg_tuple (or pg_object, etc).
This class could then be extended by concrete classes or other abstract
classes.

Hmm... does CREATE ABSTRACT [CLASS|TABLE] ... look good?

4. A unified namespace for object ids;
I expect this to allow for some pretty neat features too.

5. One must be able to determine if an oid represents a heap tuple object or
not.

I have an implementation of (1,5) and I'm trying to implement (4), because
I found no immediately apparent trivial implementation for (2) and (3).

Regards from Maurice.

Zeugswetter Andreas

andreas.zeugswetter@telecom.at

almost 28 years ago

In reply to: Maurice Gittens (#5)

Re: [HACKERS] newoid in invapi.c

So maybe we'll get rid of the oid and introduce some other "thing" which
also gives us identity for instances of classes.
But IMO no identity equates to no OO. And currently identity is only
provided by oids, so I would vote to keep them in user tables too.

In relational speak a tuple is always identified by its primary key, which also guarantees
fast access. The where oid = <value> is only fast if the user defines an index on oid.
The extensive use of oids is also a nightmare for all those that want to reorganize tables.
There is simply very much that speaks against the use of oids in the long run.

Illustra defines a "reference"; maybe we should dig into that?
create table person (
    name char (16),
    mother ref(person),
    father ref(person)
)

A unique pointer to a row for me is always:
dbid + tableid + fileid + primary key (or even rowid)

I personally like the idea of a physical address as an alternative to oid. The problem
with this is that physical position changes over time. As the past has shown, the
same problem is also present for oids. The problem could maybe be solved
with a physical position tracking system that gets reset at vacuum time. Or maybe
the existing logic for indexes could be reused in a somewhat modified manner.

Andreas

Maurice Gittens

mgittens@gits.nl

almost 28 years ago

In reply to: Zeugswetter Andreas (#6)

Re: [HACKERS] newoid in invapi.c

-----Original Message-----
From: Zeugswetter Andreas <andreas.zeugswetter@telecom.at>
To: 'pgsql-hackers@hub.org' <pgsql-hackers@hub.org>
Date: Tuesday 10 March 1998 0:36
Subject: Re: [HACKERS] newoid in invapi.c

So maybe we'll get rid of the oid and introduce some other "thing" which
also gives us identity for instances of classes.
But IMO no identity equates to no OO. And currently identity is only
provided by oids, so I would vote to keep them in user tables too.

In relational speak a tuple is always identified by its primary key, which also guarantees
fast access.

Yes, in relational speak it is.

The where oid = <value> is only fast if the user defines an index on oid.
The extensive use of oids is also a nightmare for all those that want to reorganize tables.
There is simply very much that speaks against the use of oids in the long run.

Yes I agree that the way oids are implemented now has problems.
However I choose to see these problems as "implementation details".

Illustra defines a "reference"; maybe we should dig into that?
create table person (
    name char (16),
    mother ref(person),
    father ref(person)
)

Ok, if such a reference is unique within a table then we've got something
similar to tids; if it is not, it would more resemble oids.
I don't know which is the case for illustra, but recalling its
heritage...?!?

I presume that illustra would allow me to extend the above like in the
following:

create table teacher (
    course char (32)
) inherits(person);

If the illustra system allows me to insert a teacher object as my
mother, then the illustra reference is not likely to be implemented as a
physical reference (tid) but more likely as some logical reference (oid).

I really hope the last suggestion is the case as it would much
resemble that which I would like to see in postgresql.

A unique pointer to a row for me is always:
dbid + tableid + fileid + primary key (or even rowid)

In a relational system, yes. In an OO system, not necessarily.
Because, as in the example above, a lady who happens to be my mother
may also happen to be a teacher. The identity of "my mother" and the
identity of "my mother the teacher" should be the same.
It would be a pity if "my mother" would have two identities just because
of the way my database system stores its data.

I personally like the idea of a physical address as an alternative to oid. The problem
with this is that physical position changes over time. As the past has shown, the
same problem is also present for oids. The problem could maybe be solved
with a physical position tracking system that gets reset at vacuum time. Or maybe
the existing logic for indexes could be reused in a somewhat modified manner.

I'm not trying to say that the physical address approach to identity is wrong.
I'll try to explain with an example.

<EXAMPLE>
CREATE TABLE base (f1 int4);
CREATE TABLE derived (f2 int4) inherits(base);

INSERT INTO base values(10);
INSERT INTO derived values(20,20);
</EXAMPLE>

For the query "SELECT <identifier>,f1 from base;" my ideal OO system
might give:

For the query "SELECT <identifier>, f1 from derived;" my ideal OO system
might give:

This is of course a result of the so-called "is_a" relation between
instances of the derived class and instances of the base class.
So the query is polymorphic because it also operates on instances of classes
derived from the base class.
So to support polymorphism we need to have some form of identity which
is also valid between tables. As a result the current tids in postgresql
won't work because they are only valid within one table.

You'll have noticed that my "ideal" system has different semantics
than postgresql. So as far as I am concerned there is room for improvement
in postgresql.

In my view, at the very least triggers, indices and select/update/delete
statements should be polymorphic
(should work for instances of base classes and instances of derived
classes).

Thanks, with regards from Maurice.

Maurice Gittens

mgittens@gits.nl

almost 28 years ago

In reply to: Maurice Gittens (#7)

Re: [HACKERS] newoid in invapi.c

-----Original Message-----
From: Zeugswetter Andreas <andreas.zeugswetter@telecom.at>
To: 'pgsql-hackers' <pgsql-hackers@hub.org>
Date: Tuesday 10 March 1998 16:26
Subject: WG: [HACKERS] newoid in invapi.c

There is simply very much that speaks against the use of oids in the long run.

Yes I agree that the way oids are implemented now has problems.
However I choose to see these problems as "implementation details".

No, here I disagree, a globally unique identifier like oid is an architecture and strategy thing.
As such it needs a lot of thought and care.

Yes, you are right, it does.

Illustra defines a "reference"; maybe we should dig into that?
create table person (
    name char (16),
    mother ref(person),
    father ref(person)
)

Ok, if such a reference is unique within a table then we've got something
similar to tids; if it is not, it would more resemble oids.
I don't know which is the case for illustra, but recalling its
heritage...?!?

Illustra uses oids, and of course it suffers the same bottleneck on a multi CPU System.

I presume that illustra would allow me to extend the above like in the
following:

create table teacher (
    course char (32)
) inherits (person);

Yup, syntax is: create table teacher (course char(32)) under person;

If the illustra system allows me to insert a teacher object as my
mother then the illustra reference is not likely to be implemented as
physical

No, in the above schema the teacher instance gets its own identity, same as postgresql.

What a pity.

reference (tid) but more likely with some logical reference (oid).

I really hope the last suggestion is the case as it would much
resemble that which I would like to see in postgresql.

A unique pointer to a row for me is always:
dbid + tableid + fileid + primary key (or even rowid)

In a relational system, yes. In an OO system, not necessarily.
Because, as in the example above, a lady who happens to be my mother
may also happen to be a teacher. The identity of "my mother" and the
identity of "my mother the teacher" should be the same.

Agreed, but currently not the case. Let me explain further:
If you have parents, teachers under parents and petowners under parents :-).
Then the teachers that are also petowners would get 2 oids. One for teacher
and one for petowner. I agree that this is not perfect, and can not be solved with the current
architecture :-(

I want to think about this for a while. Maybe there is some cleverness to be
found.

It would be a pity if "my mother" would have two identities just because
of the way my database system stores its data.

<snip>

So to support polymorphism we need to have some form of identity which
is also valid between tables. As a result the current tids in postgresql
won't work because they are only valid within one table.

Simply add the table id to the tid?

Might work.
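
A sketch of what "table id plus tid" could look like as a cross-table identifier. The struct layout and the names `row_ref` and `row_ref_equal` are hypothetical, chosen only to illustrate the idea; nothing here is taken from the postgresql sources.

```c
#include <assert.h>

/* Hypothetical cross-table row identifier: a table id plus a physical position. */
struct row_ref {
    unsigned int   table_id;  /* which relation the row lives in */
    unsigned int   block;     /* block number within that relation */
    unsigned short offset;    /* item offset within the block */
};

/* Two references denote the same row only if table and position both match. */
static int row_ref_equal(struct row_ref a, struct row_ref b)
{
    return a.table_id == b.table_id
        && a.block == b.block
        && a.offset == b.offset;
}
```

Such a reference distinguishes rows that sit at the same position in different tables, but it is still a physical address: it changes whenever the row moves, which is the problem with physical positions raised earlier in the thread.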

You'll have noticed that my "ideal" system has different semantics
than postgresql. So as far as I am concerned there is room for improvement
in postgresql.

In my view, at the very least triggers, indices and select/update/delete
statements should be polymorphic
(should work for instances of base classes and instances of derived
classes).

The successor of Illustra, the Informix Universal Server, does this.
We have a comment in the code stating that the base* should probably be the default.

Looks like others think that way too. :-)

Doesn't the "Postgresql Universal Server" sound nice?

In short:
1. I think your work in the current direction is very valuable!
2. I still suggest implementing it in a way that leaves the door open
to not having an oid for every table/tuple by default.
3. Tables without oid would simply not have all the OO functionality.
Tuples without the oid would not exist in the *OO world*.
4. I think it is valuable to have both OO and fast relational stuff (ORDBMS).

Your points have been taken.

Thanks,
Maurice.
