[RFC] Removing "magic" oids

Started by Andres Freundover 7 years ago48 messageshackers

andres@anarazel.de

over 7 years ago

Hi,

In my opinion the current WITH OIDs system has numerous weaknesses:

1) The fact that oids are so magic means that if we get pluggable
storage, the design of the potential pluggable systems is constrained
and similar magic has to be present everywhere.

2) The fact that the oids in each table have the same counter to be
based on means that oid wraparounds have much worse consequences
performance wise than necessary. E.g. once the global counter has
wrapped, all toast tables start to be significantly slower.

It would be much better if most database objects had their own
counters.

3) For some oid using objects (toast, large objects at the very least)
it'd be quite worthwhile to switch to 8 byte ids. Currently that's
hard to do, because it'd break on-disk compatibility.

4) There's a lot of special case code around for dealing with oids.

5a) The fact that system table oids don't show up in selects by default
makes it more work than necessary to look at catalogs.

5b) Similarly, it's fairly annoying when debugging not to trivially see
oids for catalog structs.

I think we should drop WITH OIDs support. pg_dump should convert WITH
OIDs tables into tables that have an explicit oid column (with an
appropriate default function), pg_upgrade should refuse to upgrade them.
We've defaulted WITH OIDs to off for quite a while now, so that's
hopefully not going to be too painful.

For catalog tables, I think we should just add explicit oid columns.
That obviously requires a fair amount of change, but it's not too bad.
One issue here is that we we want to support "manual" inserts both for
initdb, and for emergency work.

The attached *PROTOTYPE* *WIP* patch removes WITH OIDs support, converts
catalog table oids to explicit oid columns, and makes enough
infrastructure changes to make plain pg_regress pass (after the
necessary changes of course). Only superficial psql and no pg_dump
changes.

There's plenty of issues with the prototype, but overall I'm pretty
pleased.

There's three major areas I'm not so sure about:

1) There's a few places dealing with system tables that don't deal with
a hardcoded system table. Since there's no notion of "table has oid"
and "which column is the oid column) anymore, we need some way to
deal with that. So far I've just extended existing objectaddress.c
code to deal with that, but it's not that pretty.

2) We need to be able to manually insert into catalog tables. Both
initdb and emergency surgery. My current hack is doing so directly
in nodeModifyTable.c but that's beyond ugly. I think we should add
an explicit DEFAULT clause to those columns with something like
nextoid('tablename', 'name_of_index') or such.

3) The quickest way to deal with the bootstrap code was to just assign
all oids for oid carrying tables that don't yet have any assigned.
That doesn't generally seem terrible, although it's certainly badly
implemented right now. That'd mean we'd have three ranges of oids
probably, unless we somehow have the bootstrap code advance the
in-database oid counter to a right state before we start with
post-bootstrap work. I like the idea of all bootstrap time oids
being determined by genbki.pl (I think that'll allow to remove some
code too).

I do wonder about ripping the oid counter out entirely, and replacing it
with sequences. But that seems like a separate project.

I'll polish this up further in the not too far away future. But I'd
really like to get some feedback before I sink more time into this.

Greetings,

Andres Freund

David Fetter

david@fetter.org

over 7 years ago

In reply to: Andres Freund (#1)

Re: [RFC] Removing "magic" oids

On Sat, Sep 29, 2018 at 08:48:10PM -0700, Andres Freund wrote:

Hi,

In my opinion the current WITH OIDs system has numerous weaknesses:

1) The fact that oids are so magic means that if we get pluggable
storage, the design of the potential pluggable systems is constrained
and similar magic has to be present everywhere.

2) The fact that the oids in each table have the same counter to be
based on means that oid wraparounds have much worse consequences
performance wise than necessary. E.g. once the global counter has
wrapped, all toast tables start to be significantly slower.

It would be much better if most database objects had their own
counters.

3) For some oid using objects (toast, large objects at the very least)
it'd be quite worthwhile to switch to 8 byte ids. Currently that's
hard to do, because it'd break on-disk compatibility.

4) There's a lot of special case code around for dealing with oids.

5a) The fact that system table oids don't show up in selects by default
makes it more work than necessary to look at catalogs.

5b) Similarly, it's fairly annoying when debugging not to trivially see
oids for catalog structs.

I think we should drop WITH OIDs support.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Andreas Karlsson

andreas.karlsson@percona.com

over 7 years ago

In reply to: Andres Freund (#1)

Re: [RFC] Removing "magic" oids

On 09/30/2018 05:48 AM, Andres Freund wrote:> I think we should drop
WITH OIDs support.

2) We need to be able to manually insert into catalog tables. Both
initdb and emergency surgery. My current hack is doing so directly
in nodeModifyTable.c but that's beyond ugly. I think we should add
an explicit DEFAULT clause to those columns with something like
nextoid('tablename', 'name_of_index') or such.

Adding a function to get the next OID sounds like a good solution for
both our catalog and legacy applications. The only potential
disadvantage that I see is that this function becomes something we need
to maintain if we ever decide to move from OIDs to sequences.

3) The quickest way to deal with the bootstrap code was to just assign
all oids for oid carrying tables that don't yet have any assigned.
That doesn't generally seem terrible, although it's certainly badly
implemented right now. That'd mean we'd have three ranges of oids
probably, unless we somehow have the bootstrap code advance the
in-database oid counter to a right state before we start with
post-bootstrap work. I like the idea of all bootstrap time oids
being determined by genbki.pl (I think that'll allow to remove some
code too).

Agreed, having genbki.pl determine all oids in the bootstrap data sounds
ideal.

Andreas

Vik Fearing

vik@postgresfriends.org

over 7 years ago

In reply to: Andres Freund (#1)

Re: [RFC] Removing "magic" oids

On 30/09/18 05:48, Andres Freund wrote:

5a) The fact that system table oids don't show up in selects by default
makes it more work than necessary to look at catalogs.

As a user, this is a big pain point for me. +1 for getting rid of it.
--
Vik Fearing +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Andres Freund

andres@anarazel.de

over 7 years ago

In reply to: Andres Freund (#1)

Re: [RFC] Removing "magic" oids

Hi,

On 2018-09-29 20:48:10 -0700, Andres Freund wrote:

In my opinion the current WITH OIDs system has numerous weaknesses:

1) The fact that oids are so magic means that if we get pluggable
storage, the design of the potential pluggable systems is constrained
and similar magic has to be present everywhere.

2) The fact that the oids in each table have the same counter to be
based on means that oid wraparounds have much worse consequences
performance wise than necessary. E.g. once the global counter has
wrapped, all toast tables start to be significantly slower.

It would be much better if most database objects had their own
counters.

3) For some oid using objects (toast, large objects at the very least)
it'd be quite worthwhile to switch to 8 byte ids. Currently that's
hard to do, because it'd break on-disk compatibility.

4) There's a lot of special case code around for dealing with oids.

5a) The fact that system table oids don't show up in selects by default
makes it more work than necessary to look at catalogs.

5b) Similarly, it's fairly annoying when debugging not to trivially see
oids for catalog structs.

I think we should drop WITH OIDs support. pg_dump should convert WITH
OIDs tables into tables that have an explicit oid column (with an
appropriate default function), pg_upgrade should refuse to upgrade them.
We've defaulted WITH OIDs to off for quite a while now, so that's
hopefully not going to be too painful.

For catalog tables, I think we should just add explicit oid columns.
That obviously requires a fair amount of change, but it's not too bad.
One issue here is that we we want to support "manual" inserts both for
initdb, and for emergency work.

The attached *PROTOTYPE* *WIP* patch removes WITH OIDs support, converts
catalog table oids to explicit oid columns, and makes enough
infrastructure changes to make plain pg_regress pass (after the
necessary changes of course). Only superficial psql and no pg_dump
changes.

There's plenty of issues with the prototype, but overall I'm pretty
pleased.

There's three major areas I'm not so sure about:

1) There's a few places dealing with system tables that don't deal with
a hardcoded system table. Since there's no notion of "table has oid"
and "which column is the oid column) anymore, we need some way to
deal with that. So far I've just extended existing objectaddress.c
code to deal with that, but it's not that pretty.

2) We need to be able to manually insert into catalog tables. Both
initdb and emergency surgery. My current hack is doing so directly
in nodeModifyTable.c but that's beyond ugly. I think we should add
an explicit DEFAULT clause to those columns with something like
nextoid('tablename', 'name_of_index') or such.

3) The quickest way to deal with the bootstrap code was to just assign
all oids for oid carrying tables that don't yet have any assigned.
That doesn't generally seem terrible, although it's certainly badly
implemented right now. That'd mean we'd have three ranges of oids
probably, unless we somehow have the bootstrap code advance the
in-database oid counter to a right state before we start with
post-bootstrap work. I like the idea of all bootstrap time oids
being determined by genbki.pl (I think that'll allow to remove some
code too).

I do wonder about ripping the oid counter out entirely, and replacing it
with sequences. But that seems like a separate project.

I'll polish this up further in the not too far away future. But I'd
really like to get some feedback before I sink more time into this.

Does anybody have engineering / architecture level comments about this
proposal?

Greetings,

Andres Freund

Tom Lane

tgl@sss.pgh.pa.us

over 7 years ago

In reply to: Andres Freund (#5)

Re: [RFC] Removing "magic" oids

Andres Freund <andres@anarazel.de> writes:

Does anybody have engineering / architecture level comments about this
proposal?

FWIW, I'm -1 on making OIDs be not-magic for SELECT purposes. Yeah, it's
a wart we wouldn't have if we designed the system today, but the wart is
thirty years old. I think changing that will break so many catalog
queries that we'll have the villagers on the doorstep. Most of the other
things you're suggesting here could be done easily without making that
change.

Possibly we could make them not-magic from the storage standpoint (ie
they're regular columns) but have a pg_attribute flag that says not
to include them in "SELECT *" expansion.

BTW, the fact that we have magic OIDs in the system catalogs doesn't
mean that any other storage system has to support that. (When I was
at Salesforce, we endured *substantial* pain from insisting that we
move the catalogs into their custom storage system. There were good
reasons to do so, but it's not a decision I'd make again if there were
any plausible way not to.)

Personally, I'd drop WITH OIDs support for user tables altogether, rather
than having pg_dump create a "compatible" translation that won't be very
compatible if it loses the magicness aspect.

regards, tom lane

Andres Freund

andres@anarazel.de

over 7 years ago

In reply to: Tom Lane (#6)

Re: [RFC] Removing "magic" oids

Hi,

On 2018-10-14 18:34:50 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

Does anybody have engineering / architecture level comments about this
proposal?

FWIW, I'm -1 on making OIDs be not-magic for SELECT purposes. Yeah, it's
a wart we wouldn't have if we designed the system today, but the wart is
thirty years old. I think changing that will break so many catalog
queries that we'll have the villagers on the doorstep. Most of the other
things you're suggesting here could be done easily without making that
change.

Yea, I agree that it'll cause some pain. And we could easily 'de-magic'
oids everywhere except the SELECT * processing (that'd probably leave
some behavioural difference with composite types, but there'd it'd
probably be purely be better).

I'm not sure we want that however - yes, the short time pain will be
lower, but do we really want to inflict the confusion about invisible
oids on our users for the next 20 years? I think there's a fair argument
to be made that we should cause pain once, rather continuing to inflict
lower doeses of pain.

Possibly we could make them not-magic from the storage standpoint (ie
they're regular columns) but have a pg_attribute flag that says not
to include them in "SELECT *" expansion.

Right. And we could just use that for system columns too, removing a
number of special case logic. Seems not unlikely that we'll have further
magic columns that want to be hidden by default, given Kevin is
pondering re-starting his work on incremental matviews.

BTW, the fact that we have magic OIDs in the system catalogs doesn't
mean that any other storage system has to support that. (When I was
at Salesforce, we endured *substantial* pain from insisting that we
move the catalogs into their custom storage system. There were good
reasons to do so, but it's not a decision I'd make again if there were
any plausible way not to.)

Right. I suspect that we at some point want to support having catalog
tables in different storage formats, but certainly not initially. Part
of the reason why I want to remove WITH OIDs support is precisely that I
do not want to add this kind of magic to other storage formats.

Personally, I'd drop WITH OIDs support for user tables altogether, rather
than having pg_dump create a "compatible" translation that won't be very
compatible if it loses the magicness aspect.

Yea, I'm on-board with that.

Greetings,

Andres Freund

Thomas Munro

thomas.munro@gmail.com

over 7 years ago

In reply to: Tom Lane (#6)

Re: [RFC] Removing "magic" oids

On Mon, Oct 15, 2018 at 11:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@anarazel.de> writes:

Does anybody have engineering / architecture level comments about this
proposal?

FWIW, I'm -1 on making OIDs be not-magic for SELECT purposes. Yeah, it's
a wart we wouldn't have if we designed the system today, but the wart is
thirty years old. I think changing that will break so many catalog
queries that we'll have the villagers on the doorstep. Most of the other
things you're suggesting here could be done easily without making that
change.

Possibly we could make them not-magic from the storage standpoint (ie
they're regular columns) but have a pg_attribute flag that says not
to include them in "SELECT *" expansion.

FWIW there is interest in a general facility for hiding arbitrary
attributes from SELECT * for other reasons too:

/messages/by-id/CAEepm=3ZHh=p0nEEnVbs1Dig_UShPzHUcMNAqvDQUgYgcDo-pA@mail.gmail.com

--
Thomas Munro
http://www.enterprisedb.com

Robert Haas

robertmhaas@gmail.com

over 7 years ago

In reply to: Andres Freund (#7)

Re: [RFC] Removing "magic" oids

On Sun, Oct 14, 2018 at 6:43 PM Andres Freund <andres@anarazel.de> wrote:

I'm not sure we want that however - yes, the short time pain will be
lower, but do we really want to inflict the confusion about invisible
oids on our users for the next 20 years? I think there's a fair argument
to be made that we should cause pain once, rather continuing to inflict
lower doeses of pain.

Yeah, I think that argument has quite a bit of merit. I suspect that
there are a lot of people who expect that 'SELECT * FROM pg_whatever'
is going to show them all of the data in pg_whatever, and take a while
to figure out that it really doesn't. I know better and still mess
this up with some regularity. Anyone who finds it unintuitive that *
doesn't really mean everything is going to be happier with this change
in the long run.

In the short run, it is indeed possible that some catalog queries will
break. But, really, how many? For the most part, automated queries
written by tools probably select specific columns rather than
everything, and IIUC those won't break. And even if they do use
'SELECT *', won't they just end up with two copies of the OID column?
That might work fine.

Of course it might not, and then you'd have to fix your code, but it's
not obvious to me that this would be a horror show.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#10

Stephen Frost

sfrost@snowman.net

over 7 years ago

In reply to: Thomas Munro (#8)

Re: [RFC] Removing "magic" oids

Greetings,

* Thomas Munro (thomas.munro@enterprisedb.com) wrote:

On Mon, Oct 15, 2018 at 11:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@anarazel.de> writes:

Does anybody have engineering / architecture level comments about this
proposal?

FWIW, I'm -1 on making OIDs be not-magic for SELECT purposes. Yeah, it's
a wart we wouldn't have if we designed the system today, but the wart is
thirty years old. I think changing that will break so many catalog
queries that we'll have the villagers on the doorstep. Most of the other
things you're suggesting here could be done easily without making that
change.

Possibly we could make them not-magic from the storage standpoint (ie
they're regular columns) but have a pg_attribute flag that says not
to include them in "SELECT *" expansion.

FWIW there is interest in a general facility for hiding arbitrary
attributes from SELECT * for other reasons too:

/messages/by-id/CAEepm=3ZHh=p0nEEnVbs1Dig_UShPzHUcMNAqvDQUgYgcDo-pA@mail.gmail.com

Yeah, that's exactly what I was thinking to bring up also.

There's certainly also been explicit requests for the user to be able to
control what SELECT * means, beyond our own ideas of things we'd like to
be able to add and then hide.

Thanks!

Stephen

#11

Stephen Frost

sfrost@snowman.net

over 7 years ago

In reply to: Robert Haas (#9)

Re: [RFC] Removing "magic" oids

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:

On Sun, Oct 14, 2018 at 6:43 PM Andres Freund <andres@anarazel.de> wrote:

I'm not sure we want that however - yes, the short time pain will be
lower, but do we really want to inflict the confusion about invisible
oids on our users for the next 20 years? I think there's a fair argument
to be made that we should cause pain once, rather continuing to inflict
lower doeses of pain.

Yeah, I think that argument has quite a bit of merit. I suspect that
there are a lot of people who expect that 'SELECT * FROM pg_whatever'
is going to show them all of the data in pg_whatever, and take a while
to figure out that it really doesn't. I know better and still mess
this up with some regularity. Anyone who finds it unintuitive that *
doesn't really mean everything is going to be happier with this change
in the long run.

In the short run, it is indeed possible that some catalog queries will
break. But, really, how many? For the most part, automated queries
written by tools probably select specific columns rather than
everything, and IIUC those won't break. And even if they do use
'SELECT *', won't they just end up with two copies of the OID column?
That might work fine.

Of course it might not, and then you'd have to fix your code, but it's
not obvious to me that this would be a horror show.

I tend to agree that this won't be as much of a horror show as made out
to be up-thread. Tools should certainly be using explicit column names
and if they aren't then we're breaking them regularly anyway whenever
we change what columns exist in a given catalog table.

All the column renaming we did for v10 strikes me as a much bigger deal
than this change and while we did hear some complaints about that, I
certainly feel like it was worth it and that people generally
understood.

Thanks!

Stephen

#12

Amit Kapila

amit.kapila16@gmail.com

over 7 years ago

In reply to: Stephen Frost (#11)

Re: [RFC] Removing "magic" oids

On Wed, Oct 17, 2018 at 9:22 PM Stephen Frost <sfrost@snowman.net> wrote:

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:

On Sun, Oct 14, 2018 at 6:43 PM Andres Freund <andres@anarazel.de> wrote:

I'm not sure we want that however - yes, the short time pain will be
lower, but do we really want to inflict the confusion about invisible
oids on our users for the next 20 years? I think there's a fair argument
to be made that we should cause pain once, rather continuing to inflict
lower doeses of pain.

Yeah, I think that argument has quite a bit of merit. I suspect that
there are a lot of people who expect that 'SELECT * FROM pg_whatever'
is going to show them all of the data in pg_whatever, and take a while
to figure out that it really doesn't. I know better and still mess
this up with some regularity. Anyone who finds it unintuitive that *
doesn't really mean everything is going to be happier with this change
in the long run.

In the short run, it is indeed possible that some catalog queries will
break. But, really, how many? For the most part, automated queries
written by tools probably select specific columns rather than
everything, and IIUC those won't break. And even if they do use
'SELECT *', won't they just end up with two copies of the OID column?
That might work fine.

Of course it might not, and then you'd have to fix your code, but it's
not obvious to me that this would be a horror show.

I tend to agree that this won't be as much of a horror show as made out
to be up-thread.

+1.

Tools should certainly be using explicit column names
and if they aren't then we're breaking them regularly anyway whenever
we change what columns exist in a given catalog table.

I think we can't say with any certainty that how many application will
break, but if I have to guess then some will surely break, however, I
think it will be a good decision for the long term.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#13

Amit Kapila

amit.kapila16@gmail.com

over 7 years ago

In reply to: Andres Freund (#1)

Re: [RFC] Removing "magic" oids

On Sun, Sep 30, 2018 at 9:18 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

In my opinion the current WITH OIDs system has numerous weaknesses:

1) The fact that oids are so magic means that if we get pluggable
storage, the design of the potential pluggable systems is constrained
and similar magic has to be present everywhere.

2) The fact that the oids in each table have the same counter to be
based on means that oid wraparounds have much worse consequences
performance wise than necessary. E.g. once the global counter has
wrapped, all toast tables start to be significantly slower.

It would be much better if most database objects had their own
counters.

3) For some oid using objects (toast, large objects at the very least)
it'd be quite worthwhile to switch to 8 byte ids. Currently that's
hard to do, because it'd break on-disk compatibility.

4) There's a lot of special case code around for dealing with oids.

5a) The fact that system table oids don't show up in selects by default
makes it more work than necessary to look at catalogs.

5b) Similarly, it's fairly annoying when debugging not to trivially see
oids for catalog structs.

I think we should drop WITH OIDs support. pg_dump should convert WITH
OIDs tables into tables that have an explicit oid column (with an
appropriate default function), pg_upgrade should refuse to upgrade them.

Is there any technical reason why you think pg_upgrade should refuse
to upgrade them? I think there is an argument to break backward
compatibility here and many people on the thread seem to be okay with
that, but refusing to upgrade sounds more restrictive.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#14

Andres Freund

andres@anarazel.de

over 7 years ago

In reply to: Amit Kapila (#13)

Re: [RFC] Removing "magic" oids

Hi,

On 2018-10-28 00:21:23 +0530, Amit Kapila wrote:

On Sun, Sep 30, 2018 at 9:18 AM Andres Freund <andres@anarazel.de> wrote:

I think we should drop WITH OIDs support. pg_dump should convert WITH
OIDs tables into tables that have an explicit oid column (with an
appropriate default function), pg_upgrade should refuse to upgrade them.

Is there any technical reason why you think pg_upgrade should refuse
to upgrade them? I think there is an argument to break backward
compatibility here and many people on the thread seem to be okay with
that, but refusing to upgrade sounds more restrictive.

They'd not be on-disk compatible, because the column isn't stored as a
normal column but in the t_hoff space.

Greetings,

Andres Freund

#15

Amit Kapila

amit.kapila16@gmail.com

over 7 years ago

In reply to: Andres Freund (#14)

Re: [RFC] Removing "magic" oids

On Sun, Oct 28, 2018 at 12:26 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2018-10-28 00:21:23 +0530, Amit Kapila wrote:

On Sun, Sep 30, 2018 at 9:18 AM Andres Freund <andres@anarazel.de> wrote:

I think we should drop WITH OIDs support. pg_dump should convert WITH
OIDs tables into tables that have an explicit oid column (with an
appropriate default function), pg_upgrade should refuse to upgrade them.

Is there any technical reason why you think pg_upgrade should refuse
to upgrade them? I think there is an argument to break backward
compatibility here and many people on the thread seem to be okay with
that, but refusing to upgrade sounds more restrictive.

They'd not be on-disk compatible, because the column isn't stored as a
normal column but in the t_hoff space.

Yeah, and which means we need to re-write all such tables which is not
an attractive option and probably quite some work. So, users have to
dump and restore their databases which can be time-consuming for large
databases, but I think we don't have any better option to offer.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#16

Andres Freund

andres@anarazel.de

over 7 years ago

In reply to: Andres Freund (#1)

Re: [RFC] Removing "magic" oids

Hi All, Noah,

Noah, see one question to you below.

On 2018-09-29 20:48:10 -0700, Andres Freund wrote:

The attached *PROTOTYPE* *WIP* patch removes WITH OIDs support, converts
catalog table oids to explicit oid columns, and makes enough
infrastructure changes to make plain pg_regress pass (after the
necessary changes of course). Only superficial psql and no pg_dump
changes.

There's plenty of issues with the prototype, but overall I'm pretty
pleased.

Here's an attached version of the prototype. Main changes besides lots
of conflict resolutions are that now all the regression tests pass. I've
also excised pg_dump support, and removed docs for WITH OIDs.

Questions:
- Currently the patch allows WITHOUT OIDS and WITH (oids = false), as
those are "harmless". Should we keep that?
- pg_dump will atm emit a warning if oids are contained in dumped table,
but not fail. Does that seem reasonable?
- pg_restore does the same, just when reading the archive
- pg_dump still puts the hasoids field into the archive. Thus the format
didn't change. That seems the most reasonable to me.
- one pgbench test tested concurrent insertions into a table with
oids, as some sort of stress test for lwlocks and spinlocks. I *think*
this doesn't really have to be a system oid column, and this was just
because that's how we triggered a bug on some machine. Noah, do I get
this right?

My reading from the discussion is that:
- we'll not have some magic compat thing in pg_dump that rewrites < 12 oids
into oid columns
- pg_upgrade will error out when it encounters an oid
- oids will remain visible

Does that seem like a fair read?

1) There's a few places dealing with system tables that don't deal with
a hardcoded system table. Since there's no notion of "table has oid"
and "which column is the oid column) anymore, we need some way to
deal with that. So far I've just extended existing objectaddress.c
code to deal with that, but it's not that pretty.

I think that can reverted, if we take care of 2), now that I cleaned up
some of the uglier hacks. Right now the consequence of this is that
it's possible to insert into system catalogs without specifying the oid
only for catalogs with objectaddress.c support.

2) We need to be able to manually insert into catalog tables. Both
initdb and emergency surgery. My current hack is doing so directly
in nodeModifyTable.c but that's beyond ugly. I think we should add
an explicit DEFAULT clause to those columns with something like
nextoid('tablename', 'name_of_index') or such.

It'd be nextoid('tablename', 'oid', 'name_of_index'), I think. There's
a bit of a sequencing issue about when to add those column defaults, so
initdb can properly insert, but it doesn't sound too hard.

3) The quickest way to deal with the bootstrap code was to just assign
all oids for oid carrying tables that don't yet have any assigned.
That doesn't generally seem terrible, although it's certainly badly
implemented right now. That'd mean we'd have three ranges of oids
probably, unless we somehow have the bootstrap code advance the
in-database oid counter to a right state before we start with
post-bootstrap work. I like the idea of all bootstrap time oids
being determined by genbki.pl (I think that'll allow to remove some
code too).

I've not done anything about this so far.

Greetings,

Andres Freund

#17

Andres Freund

andres@anarazel.de

over 7 years ago

In reply to: Andres Freund (#16)

Re: [RFC] Removing "magic" oids

Hi,

On 2018-11-14 00:01:52 -0800, Andres Freund wrote:

1) There's a few places dealing with system tables that don't deal with
a hardcoded system table. Since there's no notion of "table has oid"
and "which column is the oid column) anymore, we need some way to
deal with that. So far I've just extended existing objectaddress.c
code to deal with that, but it's not that pretty.

I think that can reverted, if we take care of 2), now that I cleaned up
some of the uglier hacks. Right now the consequence of this is that
it's possible to insert into system catalogs without specifying the oid
only for catalogs with objectaddress.c support.

2) We need to be able to manually insert into catalog tables. Both
initdb and emergency surgery. My current hack is doing so directly
in nodeModifyTable.c but that's beyond ugly. I think we should add
an explicit DEFAULT clause to those columns with something like
nextoid('tablename', 'name_of_index') or such.

It'd be nextoid('tablename', 'oid', 'name_of_index'), I think. There's
a bit of a sequencing issue about when to add those column defaults, so
initdb can properly insert, but it doesn't sound too hard.

I had contemplated adding the above function as the default on system
columns - but I think that'd add significant complications to
relcache.c. As it's not actually a common occurance to manually insert
into catalog tables, my revised plan is to provide a pg_nextoid()
function, but *not* set it as the default. The only case in postgres
where that matters is the manual insertions into pg_depend during
initdb, but that query can trivially be adapted.

Currently I'm not planning to document the use pg_nextoid(), it's not
something users really should do. What I wonder about however is where
it should best live - I've put into catalog.c for now, but I'm not
entirely happy with that.

I've thus taken out the nodeModifyTable.c hack that assigned oids at
INSERT. Thus insertions without oid specified now raise a NOT NULL
violation (which can be fixed by using the pg_nextoid() function).

Comments?

3) The quickest way to deal with the bootstrap code was to just assign
all oids for oid carrying tables that don't yet have any assigned.
That doesn't generally seem terrible, although it's certainly badly
implemented right now. That'd mean we'd have three ranges of oids
probably, unless we somehow have the bootstrap code advance the
in-database oid counter to a right state before we start with
post-bootstrap work. I like the idea of all bootstrap time oids
being determined by genbki.pl (I think that'll allow to remove some
code too).

I've not done anything about this so far.

I've now revised this slightly. genbki.pl now computes the maximum oid
explicitly assigned in .dat files, and assignes oids to all 'oid'
columns without a value. It does so starting directly at the maximum
value. I personally don't see need to have implicit .bki oids be in a
different range, and having them assigned more densely is good for some
things (e.g. the fmgr builtins table, even though we currently assign
all proc oids manually).

Differing opinions about this?

Attached is revised version of the patch. I've tried to find all
references to oids and revise them accordingly. The docs changes
probably need a polish.

I quite like the diffstat:
b0e170f2c8c57b683c74e6cccfb7e46356dbacea (HEAD -> oidnix) Remove WITH OIDs support.
333 files changed, 1949 insertions(+), 4030 deletions(-)

While clearly not ready yet, I don't think it's that far off.

Missing:
- docs polish
- pg_upgrade early error
- discussion of the pg_dump/restore behaviour when encountering tables
or archives with oids. It currently warns. If we want to keep it that
way - which I think is reasonable - a bit more code can be excised.

Greetings,

Andres Freund

#18

Noah Misch

noah@leadboat.com

over 7 years ago

In reply to: Andres Freund (#16)

Re: [RFC] Removing "magic" oids

On Wed, Nov 14, 2018 at 12:01:52AM -0800, Andres Freund wrote:

- one pgbench test tested concurrent insertions into a table with
oids, as some sort of stress test for lwlocks and spinlocks. I *think*
this doesn't really have to be a system oid column, and this was just
because that's how we triggered a bug on some machine. Noah, do I get
this right?

The point of the test is to exercise OidGenLock by issuing many parallel
GetNewOidWithIndex() and verifying absence of duplicates. There's nothing
special about OidGenLock, but it is important to use an operation that takes a
particular LWLock many times, quickly. If the test query spends too much time
on things other than taking locks, it will catch locking races too rarely.

--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -48,28 +48,26 @@ sub pgbench
return;
}

-# Test concurrent insertion into table with UNIQUE oid column.  DDL expects
-# GetNewOidWithIndex() to successfully avoid violating uniqueness for indexes
-# like pg_class_oid_index and pg_proc_oid_index.  This indirectly exercises
-# LWLock and spinlock concurrency.  This test makes a 5-MiB table.
+# Test concurrent insertion into table with serial column.  This
+# indirectly exercises LWLock and spinlock concurrency.  This test
+# makes a 5-MiB table.

$node->safe_psql('postgres',
-	    'CREATE UNLOGGED TABLE oid_tbl () WITH OIDS; '
-	  . 'ALTER TABLE oid_tbl ADD UNIQUE (oid);');
+	    'CREATE UNLOGGED TABLE insert_tbl (id serial primary key); ');

pgbench(
'--no-vacuum --client=5 --protocol=prepared --transactions=25',
0,
[qr{processed: 125/125}],
[qr{^$}],
-	'concurrency OID generation',
+	'concurrent insert generation',
{
-		'001_pgbench_concurrent_oid_generation' =>
-		  'INSERT INTO oid_tbl SELECT FROM generate_series(1,1000);'
+		'001_pgbench_concurrent_insert' =>
+		  'INSERT INTO insert_tbl SELECT FROM generate_series(1,1000);'

The code for sequences is quite different, so this may or may not be an
effective replacement. To study that, you could remove a few barriers from
lwlock.c, measure how many iterations today's test needs to catch the
mutation, and then measure the same for this proposal.

#19

Andres Freund

andres@anarazel.de

over 7 years ago

In reply to: Noah Misch (#18)

Re: [RFC] Removing "magic" oids

Hi,

On 2018-11-15 04:57:28 +0000, Noah Misch wrote:

On Wed, Nov 14, 2018 at 12:01:52AM -0800, Andres Freund wrote:

- one pgbench test tested concurrent insertions into a table with
oids, as some sort of stress test for lwlocks and spinlocks. I *think*
this doesn't really have to be a system oid column, and this was just
because that's how we triggered a bug on some machine. Noah, do I get
this right?

The point of the test is to exercise OidGenLock by issuing many parallel
GetNewOidWithIndex() and verifying absence of duplicates. There's nothing
special about OidGenLock, but it is important to use an operation that takes a
particular LWLock many times, quickly. If the test query spends too much time
on things other than taking locks, it will catch locking races too rarely.

Sequences ought to do that, too. And if it's borked, we'd hopefully see
unique violations. But it's definitely not a 1:1 replacement.

pgbench(
'--no-vacuum --client=5 --protocol=prepared --transactions=25',
0,
[qr{processed: 125/125}],
[qr{^$}],
-	'concurrency OID generation',
+	'concurrent insert generation',
{
-		'001_pgbench_concurrent_oid_generation' =>
-		  'INSERT INTO oid_tbl SELECT FROM generate_series(1,1000);'
+		'001_pgbench_concurrent_insert' =>
+		  'INSERT INTO insert_tbl SELECT FROM generate_series(1,1000);'
The code for sequences is quite different, so this may or may not be an
effective replacement. To study that, you could remove a few barriers from
lwlock.c, measure how many iterations today's test needs to catch the
mutation, and then measure the same for this proposal.

Unfortunately it's really hard to hit barrier issues on x86. I think
that's the only arch I currently have access to, but it's possible I
have access to some ppc too. If you have a better idea for a
replacement test, I'd be all ears.

Thanks!

Andres

#20

John Naylor

john.naylor@enterprisedb.com

over 7 years ago

In reply to: Andres Freund (#17)

Re: [RFC] Removing "magic" oids

On 11/15/18, Andres Freund <andres@anarazel.de> wrote:

I've now revised this slightly. genbki.pl now computes the maximum oid
explicitly assigned in .dat files, and assignes oids to all 'oid'
columns without a value. It does so starting directly at the maximum
value. I personally don't see need to have implicit .bki oids be in a
different range, and having them assigned more densely is good for some
things (e.g. the fmgr builtins table, even though we currently assign
all proc oids manually).

I don't see an advantage to having a different range, but maybe it
should error out if $maxoid reaches FirstBootstrapObjectId.

This patch breaks reformat_dat_file.pl. I've attached a fix, which
also de-lists oid as a special key within the *.dat files. It might be
good to put off reformatting until feature freeze, so as not to break
others' patches.

-John Naylor

#21

Andres Freund

andres@anarazel.de

over 7 years ago

In reply to: John Naylor (#20)

#22

John Naylor

john.naylor@enterprisedb.com

over 7 years ago

In reply to: Andres Freund (#21)

#23