automatically assigning catalog toast oids

Started by John Naylorabout 7 years ago20 messages
#1John Naylor
jcnaylor@gmail.com
1 attachment(s)

Commit 96cdeae07 added toast tables to most catalogs. One disadvantage
is that the toast declarations require hard-coded oids, even though
only shared catalogs actually need stable oids. Now that we can assign
oids on the fly, it makes sense to do so for toast tables as well, as
in the attached.

-John Naylor

Attachments:

assign-toast-oids-on-the-fly.patchtext/x-patch; charset=US-ASCII; name=assign-toast-oids-on-the-fly.patchDownload
 src/backend/catalog/genbki.pl  | 26 ++++++++++++++++++++++++++
 src/include/catalog/toasting.h | 32 ++------------------------------
 2 files changed, 28 insertions(+), 30 deletions(-)

diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index edc8ea9f53..f4d7160c6a 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -285,6 +285,7 @@ my @tables_needing_macros;
 foreach my $catname (@catnames)
 {
 	my $catalog = $catalogs{$catname};
+	my $has_toastable_type = 0;
 
 	# Create one definition header with macro definitions for each catalog.
 	my $def_file = $output_path . $catname . '_d.h';
@@ -345,6 +346,12 @@ EOM
 		# Build hash of column names for use later
 		$attnames{$attname} = 1;
 
+		# Flag catalogs with toastable datatypes.
+		if ($types{$atttype}->{typstorage} eq 'x')
+		{
+			$has_toastable_type = 1;
+		}
+
 		# Emit column definitions
 		if (!$first)
 		{
@@ -506,6 +513,25 @@ EOM
 		}
 	}
 
+	# Create toast declarations for normal catalogs.  To prevent
+	# circular dependencies and to avoid adding complexity to VACUUM
+	# FULL logic, exclude pg_class, pg_attribute, and pg_index.  Also,
+	# to prevent pg_upgrade from seeing a non-empty new cluster, exclude
+	# pg_largeobject and pg_largeobject_metadata from the set as large
+	# object data is handled as user data.  Those relations have no reason
+	# to use a toast table anyway.  Toast tables and indexes for shared
+	# catalogs need stable oids, so must be specified in toasting.h.
+	if ($has_toastable_type and !$catalog->{shared_relation}
+		and $catname ne 'pg_class'
+		and $catname ne 'pg_attribute'
+		and $catname ne 'pg_index'
+		and $catname ne 'pg_largeobject'
+		and $catname ne 'pg_largeobject_metadata')
+	{
+		push @toast_decls, sprintf "declare toast %s %s on %s\n",
+		  $maxoid++, $maxoid++, $catname;
+	}
+
 	print $bki "close $catname\n";
 	printf $def "\n#endif\t\t\t\t\t\t\t/* %s_D_H */\n", uc $catname;
 
diff --git a/src/include/catalog/toasting.h b/src/include/catalog/toasting.h
index f259890e43..1c389c685d 100644
--- a/src/include/catalog/toasting.h
+++ b/src/include/catalog/toasting.h
@@ -42,37 +42,9 @@ extern void BootstrapToastTable(char *relName,
  * the OID to assign to the toast table, and the OID to assign to the
  * toast table's index.  The reason we hard-wire these OIDs is that we
  * need stable OIDs for shared relations, and that includes toast tables
- * of shared relations.
+ * of shared relations.  Toast table commands for normal catalogs are
+ * generated by genbki.pl, and oids for those are assigned on the fly.
  */
-
-/* normal catalogs */
-DECLARE_TOAST(pg_aggregate, 4159, 4160);
-DECLARE_TOAST(pg_attrdef, 2830, 2831);
-DECLARE_TOAST(pg_collation, 4161, 4162);
-DECLARE_TOAST(pg_constraint, 2832, 2833);
-DECLARE_TOAST(pg_default_acl, 4143, 4144);
-DECLARE_TOAST(pg_description, 2834, 2835);
-DECLARE_TOAST(pg_event_trigger, 4145, 4146);
-DECLARE_TOAST(pg_extension, 4147, 4148);
-DECLARE_TOAST(pg_foreign_data_wrapper, 4149, 4150);
-DECLARE_TOAST(pg_foreign_server, 4151, 4152);
-DECLARE_TOAST(pg_foreign_table, 4153, 4154);
-DECLARE_TOAST(pg_init_privs, 4155, 4156);
-DECLARE_TOAST(pg_language, 4157, 4158);
-DECLARE_TOAST(pg_namespace, 4163, 4164);
-DECLARE_TOAST(pg_partitioned_table, 4165, 4166);
-DECLARE_TOAST(pg_policy, 4167, 4168);
-DECLARE_TOAST(pg_proc, 2836, 2837);
-DECLARE_TOAST(pg_rewrite, 2838, 2839);
-DECLARE_TOAST(pg_seclabel, 3598, 3599);
-DECLARE_TOAST(pg_statistic, 2840, 2841);
-DECLARE_TOAST(pg_statistic_ext, 3439, 3440);
-DECLARE_TOAST(pg_trigger, 2336, 2337);
-DECLARE_TOAST(pg_ts_dict, 4169, 4170);
-DECLARE_TOAST(pg_type, 4171, 4172);
-DECLARE_TOAST(pg_user_mapping, 4173, 4174);
-
-/* shared catalogs */
 DECLARE_TOAST(pg_authid, 4175, 4176);
 #define PgAuthidToastTable 4175
 #define PgAuthidToastIndex 4176
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Naylor (#1)
Re: automatically assigning catalog toast oids

John Naylor <jcnaylor@gmail.com> writes:

Commit 96cdeae07 added toast tables to most catalogs. One disadvantage
is that the toast declarations require hard-coded oids, even though
only shared catalogs actually need stable oids. Now that we can assign
oids on the fly, it makes sense to do so for toast tables as well, as
in the attached.

I'm a bit dubious that this is a good idea. It's handy, at least for
forensic situations, that the system catalogs have stable OIDs.

regards, tom lane

#3Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#2)
Re: automatically assigning catalog toast oids

Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

John Naylor <jcnaylor@gmail.com> writes:

Commit 96cdeae07 added toast tables to most catalogs. One disadvantage
is that the toast declarations require hard-coded oids, even though
only shared catalogs actually need stable oids. Now that we can assign
oids on the fly, it makes sense to do so for toast tables as well, as
in the attached.

I'm a bit dubious that this is a good idea. It's handy, at least for
forensic situations, that the system catalogs have stable OIDs.

I tend to agree... What's the advantage of assigning them on the fly?

Thanks!

Stephen

#4Andres Freund
andres@anarazel.de
In reply to: Stephen Frost (#3)
Re: automatically assigning catalog toast oids

Hi,

On 2018-12-09 15:42:57 -0500, Stephen Frost wrote:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

John Naylor <jcnaylor@gmail.com> writes:

Commit 96cdeae07 added toast tables to most catalogs. One disadvantage
is that the toast declarations require hard-coded oids, even though
only shared catalogs actually need stable oids. Now that we can assign
oids on the fly, it makes sense to do so for toast tables as well, as
in the attached.

I'm a bit dubious that this is a good idea. It's handy, at least for
forensic situations, that the system catalogs have stable OIDs.

Hm, but won't they have that for major versions anyway? We ought not to
change the .bki generation in a way that results in differing oids after
a release, no?

I tend to agree... What's the advantage of assigning them on the fly?

No idea if that's John's reasoning, but I do like not having to do yet
another manual step that you need to remember/figure out when adding a
new catalog relation.

Greetings,

Andres Freund

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#4)
Re: automatically assigning catalog toast oids

Andres Freund <andres@anarazel.de> writes:

On 2018-12-09 15:42:57 -0500, Stephen Frost wrote:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

I'm a bit dubious that this is a good idea. It's handy, at least for
forensic situations, that the system catalogs have stable OIDs.

Hm, but won't they have that for major versions anyway? We ought not to
change the .bki generation in a way that results in differing oids after
a release, no?

Well, that's just a different very-easily-broken assumption. There are
a lot of things that make auto-assigned OIDs unstable, and I do not think
that we want to guarantee that they'll hold still across a release series.

BTW, now that I'm looking at it, I very much dislike the way commit
578b2297 handled auto-assignment of OIDs in catalogs. Not only do I not
want to assign more OIDs that way, I don't think we can safely ship v12
like this. There's basically nothing at all that guarantees
genbki-assigned OIDs won't overlap initdb-assigned OIDs. In fact,
I think it's about guaranteed to blow up in the face of anybody who
inserts manually-assigned OIDs above around 9000.

What we probably need to do is restructure the FirstBootstrapObjectId
business so that there are separate, well-defined ranges for genbki.pl
and initdb to use.

regards, tom lane

#6Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#5)
Re: automatically assigning catalog toast oids

Hi,

On 2018-12-09 17:14:42 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2018-12-09 15:42:57 -0500, Stephen Frost wrote:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

I'm a bit dubious that this is a good idea. It's handy, at least for
forensic situations, that the system catalogs have stable OIDs.

Hm, but won't they have that for major versions anyway? We ought not to
change the .bki generation in a way that results in differing oids after
a release, no?

Well, that's just a different very-easily-broken assumption. There are
a lot of things that make auto-assigned OIDs unstable, and I do not think
that we want to guarantee that they'll hold still across a release series.

Why wouldn't they be for genbki (rather than initdb) assigned oids? I
don't think it's reasonable to add new functions or such post release
that would have move oid assignments for other objects?

BTW, now that I'm looking at it, I very much dislike the way commit
578b2297 handled auto-assignment of OIDs in catalogs. Not only do I not
want to assign more OIDs that way

Why do you not want that?

, I don't think we can safely ship v12 like this. There's basically
nothing at all that guarantees genbki-assigned OIDs won't overlap
initdb-assigned OIDs. In fact, I think it's about guaranteed to blow
up in the face of anybody who inserts manually-assigned OIDs above
around 9000.

I'm not sure I really see a pressing problem with that. I think it'd not
be insane to treat this as a "well, don't do that then".

What we probably need to do is restructure the FirstBootstrapObjectId
business so that there are separate, well-defined ranges for genbki.pl
and initdb to use.

I'm fine with adding a distinct range, the earlier version of the patch
had that. I'd asked for comments if anybody felt a need to keep that,
nobody replied... I alternatively proposed that we could just start at
FirstNormalObjectId for those and update the server's oid start value to
the maximum genbki assigned oid. Do you have preferences around that?

Greetings,

Andres Freund

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#6)
Re: automatically assigning catalog toast oids

Andres Freund <andres@anarazel.de> writes:

On 2018-12-09 17:14:42 -0500, Tom Lane wrote:

Well, that's just a different very-easily-broken assumption. There are
a lot of things that make auto-assigned OIDs unstable, and I do not think
that we want to guarantee that they'll hold still across a release series.

Why wouldn't they be for genbki (rather than initdb) assigned oids? I
don't think it's reasonable to add new functions or such post release
that would have move oid assignments for other objects?

As you've got this set up, we couldn't change *anything* for fear of
it moving auto-assignments; there's no isolation between catalogs.

Another thing I seriously dislike is that this allows people to omit OIDs
from .dat entries in catalogs where we traditionally hand-assign OIDs.
That's not a good idea because it would mean those entries don't have
stable OIDs, whereas the whole point of hand assignment is to ensure
all built-in objects of a particular type have stable OIDs. Now, you
could argue about the usefulness of that policy for any given catalog;
but if we decide that catalog X doesn't need stable OIDs then that should
be an intentional policy change, not something that can happen because
one lazy hacker didn't follow the policy.

(This is, btw, another reason why I don't much like allowing toast
tables to not have stable OIDs; it confuses what the policy is for
pg_class.)

In fact, I think it's about guaranteed to blow
up in the face of anybody who inserts manually-assigned OIDs above
around 9000.

I'm not sure I really see a pressing problem with that. I think it'd not
be insane to treat this as a "well, don't do that then".

I can name more than one company that will be pretty damn unhappy when
we break their local patches that make use of 9K-range OIDs. For better
or worse, the policy for the last 20 years has been that OIDs up to 9999
are available for hand assignment. What you've just done is to break
that for OIDs above some boundary that's not even very well defined
(though it seems to be somewhere around 8500 as of HEAD).

What we probably need to do is restructure the FirstBootstrapObjectId
business so that there are separate, well-defined ranges for genbki.pl
and initdb to use.

I'm fine with adding a distinct range, the earlier version of the patch
had that. I'd asked for comments if anybody felt a need to keep that,
nobody replied... I alternatively proposed that we could just start at
FirstNormalObjectId for those and update the server's oid start value to
the maximum genbki assigned oid. Do you have preferences around that?

Yeah, I thought about the latter as well. But it adds complexity to the
bootstrap process and makes it harder to tell what assigned a particular
OID, so I'd rather go with the former, at least until the OID situation
gets too tight to allow for daylight between the ranges.

It looks to me like as of HEAD, genbki.pl is auto-assigning about 1470
OIDs. Meanwhile, on my RHEL6 machine, initdb is auto-assigning about
1740 OIDs (what a coincidence); of those, 872 are collation entries
that are absorbed from the system environment. So the second number is
likely to vary a lot from platform to platform. (I don't have ICU
enabled; I wonder how many that typically adds.)

I'd be inclined to allow say 2000 OIDs for genbki.pl, with 4384 therefore
available for initdb. We could expect to have to raise the boundary
from time to time, but not very often.

regards, tom lane

#8John Naylor
jcnaylor@gmail.com
In reply to: Tom Lane (#7)
Re: automatically assigning catalog toast oids

On 12/9/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Another thing I seriously dislike is that this allows people to omit OIDs
from .dat entries in catalogs where we traditionally hand-assign OIDs.
That's not a good idea because it would mean those entries don't have
stable OIDs, whereas the whole point of hand assignment is to ensure
all built-in objects of a particular type have stable OIDs. Now, you
could argue about the usefulness of that policy for any given catalog;
but if we decide that catalog X doesn't need stable OIDs then that should
be an intentional policy change, not something that can happen because
one lazy hacker didn't follow the policy.

On this point, I believe this could have happened anyway. pg_opclass
has a mix of hand- and initdb-assigned oids, and there was nothing
previously to stop that from creeping into any other catalog, as far
as I can tell.

-John Naylor

#9Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#7)
1 attachment(s)
Re: automatically assigning catalog toast oids

Hi,

On 2018-12-09 18:43:14 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2018-12-09 17:14:42 -0500, Tom Lane wrote:

Well, that's just a different very-easily-broken assumption. There are
a lot of things that make auto-assigned OIDs unstable, and I do not think
that we want to guarantee that they'll hold still across a release series.

Why wouldn't they be for genbki (rather than initdb) assigned oids? I
don't think it's reasonable to add new functions or such post release
that would have move oid assignments for other objects?

As you've got this set up, we couldn't change *anything* for fear of
it moving auto-assignments; there's no isolation between catalogs.

But there wasn't any previously either?

Another thing I seriously dislike is that this allows people to omit OIDs
from .dat entries in catalogs where we traditionally hand-assign OIDs.

That's not new, is it? Sure, now genbki.pl assigns the oid, but
previously it'd just have been heap_insert()? bootparse.y/bootstrap.c
never enforced that oids are assigned for tables that have oids.

That's not a good idea because it would mean those entries don't have
stable OIDs, whereas the whole point of hand assignment is to ensure
all built-in objects of a particular type have stable OIDs. Now, you
could argue about the usefulness of that policy for any given catalog;
but if we decide that catalog X doesn't need stable OIDs then that should
be an intentional policy change, not something that can happen because
one lazy hacker didn't follow the policy.

I think we should change that policy, but I also think that there wasn't
any meaningful "assignment policy" change in what I did. So that just
seems like a separate argument.

Note that changing that for "prominent" catalogs would be a bit more
work than just changing the policy, as we'd need to assign oids before
the lookup tables are built - although the current behaviour would kind
of allow us to implement the "not crazy" policy of allowing
auto-assignment as long as the object isn't referenced; but via an imo
fairly opaque mechanism.

I'm fine with adding a distinct range, the earlier version of the patch
had that. I'd asked for comments if anybody felt a need to keep that,
nobody replied... I alternatively proposed that we could just start at
FirstNormalObjectId for those and update the server's oid start value to
the maximum genbki assigned oid. Do you have preferences around that?

Yeah, I thought about the latter as well. But it adds complexity to the
bootstrap process and makes it harder to tell what assigned a particular
OID, so I'd rather go with the former, at least until the OID situation
gets too tight to allow for daylight between the ranges.

Yea, it doesn't seem perfect, that's basically why I didn't go for it
last time.

It looks to me like as of HEAD, genbki.pl is auto-assigning about 1470
OIDs. Meanwhile, on my RHEL6 machine, initdb is auto-assigning about
1740 OIDs (what a coincidence); of those, 872 are collation entries
that are absorbed from the system environment. So the second number is
likely to vary a lot from platform to platform. (I don't have ICU
enabled; I wonder how many that typically adds.)

I'd be inclined to allow say 2000 OIDs for genbki.pl, with 4384 therefore
available for initdb. We could expect to have to raise the boundary
from time to time, but not very often.

I've attached a patch implementing that. I'm not particularly in love
with FirstGenbkiObjectId as the symbol, but I couldn't think of
something more descriptive.

I changed the length of fmgr_builtin_oid_index to FirstGenbkiObjectId -
until we allow pg_proc oids to be auto-assigned that'd just be wasted
memory otherwise?

I did *not* change record_plan_function_dependency(), it seems correct
that it doesn't track genbki assigned oids, they certainly can't change
while a server is running. But I'm not entirely clear to why that's not
using FirstNormalObjectId as the cutoff, so perhaps I'm missing
something. Similar with logic in predicate.c.

I did however change postgres_fdw's is_builtin(), as that says:
 /*
  * Return true if given object is one of PostgreSQL's built-in objects.
  *
- * We use FirstBootstrapObjectId as the cutoff, so that we only consider
+ * We use FirstGenbkiObjectId as the cutoff, so that we only consider
  * objects with hand-assigned OIDs to be "built in", not for instance any
  * function or type defined in the information_schema.
  *
@@ -154,7 +154,7 @@ lookup_shippable(Oid objectId, Oid classId, PgFdwRelationInfo *fpinfo)

and >= FirstGenbkiObjectId would not be maniually assigned.

I added a throwaway "with 9000-9999 tentatively reserved for forks." to
transam.h, but I'm not sure we really want that, or whether that's good
wording.

Greetings,

Andres Freund

Attachments:

genbki-object-id-range.difftext/x-diff; charset=us-asciiDownload
diff --git a/contrib/postgres_fdw/shippable.c b/contrib/postgres_fdw/shippable.c
index 7f2ed0499c0..fb98faa5fd7 100644
--- a/contrib/postgres_fdw/shippable.c
+++ b/contrib/postgres_fdw/shippable.c
@@ -137,7 +137,7 @@ lookup_shippable(Oid objectId, Oid classId, PgFdwRelationInfo *fpinfo)
 /*
  * Return true if given object is one of PostgreSQL's built-in objects.
  *
- * We use FirstBootstrapObjectId as the cutoff, so that we only consider
+ * We use FirstGenbkiObjectId as the cutoff, so that we only consider
  * objects with hand-assigned OIDs to be "built in", not for instance any
  * function or type defined in the information_schema.
  *
@@ -154,7 +154,7 @@ lookup_shippable(Oid objectId, Oid classId, PgFdwRelationInfo *fpinfo)
 bool
 is_builtin(Oid objectId)
 {
-	return (objectId < FirstBootstrapObjectId);
+	return (objectId < FirstGenbkiObjectId);
 }
 
 /*
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index 0865240f11f..aaa3af7e5a1 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -88,7 +88,9 @@ generated-header-symlinks: $(top_builddir)/src/include/catalog/header-stamp
 # instead is cheating a bit, but it will achieve the goal of updating the
 # version number when it changes.
 bki-stamp: genbki.pl Catalog.pm $(POSTGRES_BKI_SRCS) $(POSTGRES_BKI_DATA) $(top_srcdir)/configure.in
-	$(PERL) -I $(catalogdir) $< --set-version=$(MAJORVERSION) $(POSTGRES_BKI_SRCS)
+	$(PERL) -I $(catalogdir) $< \
+		-I $(top_srcdir)/src/include/ --set-version=$(MAJORVERSION) \
+		$(POSTGRES_BKI_SRCS)
 	touch $@
 
 # The generated headers must all be symlinked into builddir/src/include/,
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index edc8ea9f533..8e2a2480be6 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -22,6 +22,7 @@ use warnings;
 my @input_files;
 my $output_path = '';
 my $major_version;
+my $include_path;
 
 # Process command line switches.
 while (@ARGV)
@@ -31,6 +32,10 @@ while (@ARGV)
 	{
 		push @input_files, $arg;
 	}
+	elsif ($arg =~ /^-I/)
+	{
+		$include_path = length($arg) > 2 ? substr($arg, 2) : shift @ARGV;
+	}
 	elsif ($arg =~ /^-o/)
 	{
 		$output_path = length($arg) > 2 ? substr($arg, 2) : shift @ARGV;
@@ -50,6 +55,7 @@ while (@ARGV)
 # Sanity check arguments.
 die "No input files.\n" if !@input_files;
 die "--set-version must be specified.\n" if !defined $major_version;
+die "-I, the header include path, must be specified.\n" if !$include_path;
 
 # Make sure output_path ends in a slash.
 if ($output_path ne '' && substr($output_path, -1) ne '/')
@@ -133,17 +139,9 @@ foreach my $header (@input_files)
 # While duplicate OIDs would only cause a failure if they appear in
 # the same catalog, our project policy is that manually assigned OIDs
 # should be globally unique, to avoid confusion.
-#
-# Also use the loop to determine the maximum explicitly assigned oid
-# found in the data file, we'll use that for default oid assignments.
 my $found = 0;
-my $maxoid = 0;
 foreach my $oid (keys %oidcounts)
 {
-	if ($oid > $maxoid)
-	{
-		$maxoid = $oid;
-	}
 	next unless $oidcounts{$oid} > 1;
 	print STDERR "Duplicate OIDs detected:\n" if !$found;
 	print STDERR "$oid\n";
@@ -151,6 +149,15 @@ foreach my $oid (keys %oidcounts)
 }
 die "found $found duplicate OID(s) in catalog data\n" if $found;
 
+
+# Oids not specified in the input files are automatically assigned,
+# starting at FirstGenbkiObjectId.
+my $FirstGenbkiObjectId =
+  Catalog::FindDefinedSymbol('access/transam.h', $include_path,
+	'FirstGenbkiObjectId');
+my $GenbkiNextOid = $FirstGenbkiObjectId;
+
+
 # Fetch some special data that we will substitute into the output file.
 # CAUTION: be wary about what symbols you substitute into the .bki file here!
 # It's okay to substitute things that are expected to be really constant
@@ -418,8 +425,8 @@ EOM
 			# Assign oid if oid column exists and no explicit assignment in row
 			if ($attname eq "oid" and not defined $bki_values{$attname})
 			{
-				$bki_values{$attname} = $maxoid;
-				$maxoid++;
+				$bki_values{$attname} = $GenbkiNextOid;
+				$GenbkiNextOid++;
 			}
 
 			# Substitute constant values we acquired above.
@@ -858,6 +865,7 @@ sub usage
 Usage: genbki.pl [options] header...
 
 Options:
+    -I               include path
     -o               output path
     --set-version    PostgreSQL version number for initdb cross-check
 
diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index ca282913552..ed16737a6a1 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -79,9 +79,9 @@ foreach my $datfile (@input_files)
 }
 
 # Fetch some values for later.
-my $FirstBootstrapObjectId =
+my $FirstGenbkiObjectId =
   Catalog::FindDefinedSymbol('access/transam.h', $include_path,
-	'FirstBootstrapObjectId');
+	'FirstGenbkiObjectId');
 my $INTERNALlanguageId =
   Catalog::FindDefinedSymbolFromData($catalog_data{pg_language},
 	'INTERNALlanguageId');
@@ -252,13 +252,13 @@ const int fmgr_nbuiltins = (sizeof(fmgr_builtins) / sizeof(FmgrBuiltin));
 
 # Create fmgr_builtins_oid_index table.
 #
-# Note that the array has to be filled up to FirstBootstrapObjectId,
+# Note that the array has to be filled up to FirstGenbkiObjectId,
 # as we can't rely on zero initialization as 0 is a valid mapping.
 print $tfh qq|
-const uint16 fmgr_builtin_oid_index[FirstBootstrapObjectId] = {
+const uint16 fmgr_builtin_oid_index[FirstGenbkiObjectId] = {
 |;
 
-for (my $i = 0; $i < $FirstBootstrapObjectId; $i++)
+for (my $i = 0; $i < $FirstGenbkiObjectId; $i++)
 {
 	my $oid = $fmgr_builtin_oid_index[$i];
 
@@ -269,7 +269,7 @@ for (my $i = 0; $i < $FirstBootstrapObjectId; $i++)
 		$oid = 'InvalidOidBuiltinMapping';
 	}
 
-	if ($i + 1 == $FirstBootstrapObjectId)
+	if ($i + 1 == $FirstGenbkiObjectId)
 	{
 		print $tfh "  $oid\n";
 	}
diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index 73ff48c1963..231bf885989 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -75,7 +75,7 @@ fmgr_isbuiltin(Oid id)
 	uint16		index;
 
 	/* fast lookup only possible if original oid still assigned */
-	if (id >= FirstBootstrapObjectId)
+	if (id >= FirstGenbkiObjectId)
 		return NULL;
 
 	/*
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 83ec3f19797..f979a065f9f 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -71,18 +71,21 @@
 /* ----------
  *		Object ID (OID) zero is InvalidOid.
  *
- *		OIDs 1-9999 are reserved for manual assignment (see the files
- *		in src/include/catalog/).
+ *		OIDs 1-9999 are reserved for manual assignment (see .dat files in
+ *		src/include/catalog/), with 9000-9999 tentatively reserved for forks.
  *
- *		OIDS 10000-16383 are reserved for assignment during initdb
- *		using the OID generator.  (We start the generator at 10000.)
+ *		OIDs 10000-12000 are reserved for assignment by genbki.pl, when the
+ *		.dat files in src/include/catalog/ do not specify oids.
+ *
+ *		OIDS 12000-16383 are reserved for assignment during initdb
+ *		using the OID generator.  (We start the generator at 12000.)
  *
  *		OIDs beginning at 16384 are assigned from the OID generator
  *		during normal multiuser operation.  (We force the generator up to
  *		16384 as soon as we are in normal operation.)
  *
- * The choices of 10000 and 16384 are completely arbitrary, and can be moved
- * if we run low on OIDs in either category.  Changing the macros below
+ * The choices of 10000, 12000 and 16384 are completely arbitrary, and can be
+ * moved if we run low on OIDs in either category.  Changing the macros below
  * should be sufficient to do this.
  *
  * NOTE: if the OID generator wraps around, we skip over OIDs 0-16383
@@ -90,7 +93,8 @@
  * reassigning OIDs that might have been assigned during initdb.
  * ----------
  */
-#define FirstBootstrapObjectId	10000
+#define FirstGenbkiObjectId		10000
+#define FirstBootstrapObjectId	12000
 #define FirstNormalObjectId		16384
 
 /*
diff --git a/src/include/catalog/unused_oids b/src/include/catalog/unused_oids
index c5fc378ae34..b5cafe0f3d3 100755
--- a/src/include/catalog/unused_oids
+++ b/src/include/catalog/unused_oids
@@ -32,11 +32,11 @@ my @input_files = (glob("pg_*.h"), qw(indexing.h toasting.h));
 
 my $oids = Catalog::FindAllOidsFromHeaders(@input_files);
 
-# Also push FirstBootstrapObjectId to serve as a terminator for the last gap.
-my $FirstBootstrapObjectId =
+# Also push FirstGenbkiObjectId to serve as a terminator for the last gap.
+my $FirstGenbkiObjectId =
   Catalog::FindDefinedSymbol('access/transam.h', '..',
-	'FirstBootstrapObjectId');
-push @{$oids}, $FirstBootstrapObjectId;
+	'FirstGenbkiObjectId');
+push @{$oids}, $FirstGenbkiObjectId;
 
 my $prev_oid = 0;
 foreach my $oid (sort { $a <=> $b } @{$oids})
diff --git a/src/include/utils/fmgrtab.h b/src/include/utils/fmgrtab.h
index 6122ed3e978..72671c38296 100644
--- a/src/include/utils/fmgrtab.h
+++ b/src/include/utils/fmgrtab.h
@@ -41,6 +41,6 @@ extern const int fmgr_nbuiltins;	/* number of entries in table */
  * array.
  */
 #define InvalidOidBuiltinMapping PG_UINT16_MAX
-extern const uint16 fmgr_builtin_oid_index[FirstBootstrapObjectId];
+extern const uint16 fmgr_builtin_oid_index[FirstGenbkiObjectId];
 
 #endif							/* FMGRTAB_H */
#10John Naylor
jcnaylor@gmail.com
In reply to: Andres Freund (#9)
Re: automatically assigning catalog toast oids

On 12/11/18, Andres Freund <andres@anarazel.de> wrote:

I've attached a patch implementing that. I'm not particularly in love
with FirstGenbkiObjectId as the symbol, but I couldn't think of
something more descriptive.

How about FirstAutoAssignedObjectId?

-John Naylor

#11Andres Freund
andres@anarazel.de
In reply to: John Naylor (#10)
Re: automatically assigning catalog toast oids

Hi,

On 2018-12-11 19:33:23 -0500, John Naylor wrote:

On 12/11/18, Andres Freund <andres@anarazel.de> wrote:

I've attached a patch implementing that. I'm not particularly in love
with FirstGenbkiObjectId as the symbol, but I couldn't think of
something more descriptive.

How about FirstAutoAssignedObjectId?

That sounds too general to me. Could just as well be what
FirstNormalObjectId is, or what FirstBootstrapObjectId is. So between
those I do prefer FirstGenbkiObjectId, as that's clearer. Could go for
FirstGenbkiAssignedObjectId too, but that's probably a bit redundant.

Greetings,

Andres Freund

#12Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#9)
Re: automatically assigning catalog toast oids

On 2018-12-11 15:08:02 -0800, Andres Freund wrote:

Hi,

On 2018-12-09 18:43:14 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2018-12-09 17:14:42 -0500, Tom Lane wrote:

Well, that's just a different very-easily-broken assumption. There are
a lot of things that make auto-assigned OIDs unstable, and I do not think
that we want to guarantee that they'll hold still across a release series.

Why wouldn't they be for genbki (rather than initdb) assigned oids? I
don't think it's reasonable to add new functions or such post release
that would have move oid assignments for other objects?

As you've got this set up, we couldn't change *anything* for fear of
it moving auto-assignments; there's no isolation between catalogs.

But there wasn't any previously either?

Another thing I seriously dislike is that this allows people to omit OIDs
from .dat entries in catalogs where we traditionally hand-assign OIDs.

That's not new, is it? Sure, now genbki.pl assigns the oid, but
previously it'd just have been heap_insert()? bootparse.y/bootstrap.c
never enforced that oids are assigned for tables that have oids.

That's not a good idea because it would mean those entries don't have
stable OIDs, whereas the whole point of hand assignment is to ensure
all built-in objects of a particular type have stable OIDs. Now, you
could argue about the usefulness of that policy for any given catalog;
but if we decide that catalog X doesn't need stable OIDs then that should
be an intentional policy change, not something that can happen because
one lazy hacker didn't follow the policy.

I think we should change that policy, but I also think that there wasn't
any meaningful "assignment policy" change in what I did. So that just
seems like a separate argument.

Note that changing that for "prominent" catalogs would be a bit more
work than just changing the policy, as we'd need to assign oids before
the lookup tables are built - although the current behaviour would kind
of allow us to implement the "not crazy" policy of allowing
auto-assignment as long as the object isn't referenced; but via an imo
fairly opaque mechanism.

I'm fine with adding a distinct range, the earlier version of the patch
had that. I'd asked for comments if anybody felt a need to keep that,
nobody replied... I alternatively proposed that we could just start at
FirstNormalObjectId for those and update the server's oid start value to
the maximum genbki assigned oid. Do you have preferences around that?

Yeah, I thought about the latter as well. But it adds complexity to the
bootstrap process and makes it harder to tell what assigned a particular
OID, so I'd rather go with the former, at least until the OID situation
gets too tight to allow for daylight between the ranges.

Yea, it doesn't seem perfect, that's basically why I didn't go for it
last time.

It looks to me like as of HEAD, genbki.pl is auto-assigning about 1470
OIDs. Meanwhile, on my RHEL6 machine, initdb is auto-assigning about
1740 OIDs (what a coincidence); of those, 872 are collation entries
that are absorbed from the system environment. So the second number is
likely to vary a lot from platform to platform. (I don't have ICU
enabled; I wonder how many that typically adds.)

I'd be inclined to allow say 2000 OIDs for genbki.pl, with 4384 therefore
available for initdb. We could expect to have to raise the boundary
from time to time, but not very often.

I've attached a patch implementing that. I'm not particularly in love
with FirstGenbkiObjectId as the symbol, but I couldn't think of
something more descriptive.

I changed the length of fmgr_builtin_oid_index to FirstGenbkiObjectId -
until we allow pg_proc oids to be auto-assigned that'd just be wasted
memory otherwise?

I did *not* change record_plan_function_dependency(), it seems correct
that it doesn't track genbki assigned oids, they certainly can't change
while a server is running. But I'm not entirely clear to why that's not
using FirstNormalObjectId as the cutoff, so perhaps I'm missing
something. Similar with logic in predicate.c.

I did however change postgres_fdw's is_builtin(), as that says:
/*
* Return true if given object is one of PostgreSQL's built-in objects.
*
- * We use FirstBootstrapObjectId as the cutoff, so that we only consider
+ * We use FirstGenbkiObjectId as the cutoff, so that we only consider
* objects with hand-assigned OIDs to be "built in", not for instance any
* function or type defined in the information_schema.
*
@@ -154,7 +154,7 @@ lookup_shippable(Oid objectId, Oid classId, PgFdwRelationInfo *fpinfo)

and >= FirstGenbkiObjectId would not be maniually assigned.

I added a throwaway "with 9000-9999 tentatively reserved for forks." to
transam.h, but I'm not sure we really want that, or whether that's good
wording.

I've pushed it like that, after blindly attempting to satisfy
Solution.pm...

- Andres

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#12)
Re: automatically assigning catalog toast oids

Andres Freund <andres@anarazel.de> writes:

[ separation of FirstBootstrapObjectId and FirstGenbkiObjectId ]

It seems like we should also add a check to genbki.pl that the ending
value of GenbkiNextOid is <= FirstBootstrapObjectId, so that we'll
certainly notice when it's necessary to adjust those boundaries.

(There's not a similar check for whether initdb runs past
FirstNormalObjectId, but I think that's fine: the system should cope
with hitting those OIDs again after a wraparound. Also, because of our
uncertainty about just how many OIDs will be consumed for collations,
it'd be a bad idea to ship code with a hard constraint on that boundary
not being overrun.)

I did *not* change record_plan_function_dependency(), it seems correct
that it doesn't track genbki assigned oids, they certainly can't change
while a server is running. But I'm not entirely clear to why that's not
using FirstNormalObjectId as the cutoff, so perhaps I'm missing
something. Similar with logic in predicate.c.

It's not about whether they can change whether the server is running,
it's about whether they're pinned. OIDs above 10000 might or might
not be pinned; we shouldn't hard-wire assumptions about that. So
I suspect you made the wrong choice there.

BTW, this ties into something that was bugging me this afternoon while
looking at dependencies on the default collation: there's a bunch of
places that special-case DEFAULT_COLLATION_OID to skip adding dependencies
on it, for instance this in dependency.c:

/*
* We must also depend on the constant's collation: it could be
* different from the datatype's, if a CollateExpr was const-folded to
* a simple constant. However we can save work in the most common
* case where the collation is "default", since we know that's pinned.
*/
if (OidIsValid(con->constcollid) &&
con->constcollid != DEFAULT_COLLATION_OID)
add_object_address(OCLASS_COLLATION, con->constcollid, 0,
context->addrs);

I'm pretty sure that that special case is my fault, added in perhaps
over-eagerness to minimize the cost added by the collations feature.
Looking at it now, it seems like mostly a wart. But perhaps there is
a more general principle here: why don't we just hard-wire an assumption
that all OIDs below FirstGenbkiObjectId are pinned? I'm thinking of
getting rid of the DEFAULT_COLLATION_OID special cases in dependency-
recording logic and instead having someplace central like isObjectPinned
just assume that such OIDs are pinned, so that we don't bother to
consult or update pg_depend for them.

We could take it a bit further and not make the 'p' entries for such
OIDs in the first place, greatly reducing the size of pg_depend in
typical installations. Perhaps that'd confuse clients that look at
that catalog, though.

Looking at the existing entries, it seems like we'd have to have
one special case: schema public has OID 2200 but is intentionally
not pinned. I wouldn't have a hard time with teaching isObjectPinned
about that; though if it turns out that many places need to know
about it, maybe it's not workable.

Thoughts?

regards, tom lane

#14John Naylor
jcnaylor@gmail.com
In reply to: Tom Lane (#13)
Re: automatically assigning catalog toast oids

On 12/13/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Looking at the existing entries, it seems like we'd have to have
one special case: schema public has OID 2200 but is intentionally
not pinned.

setup_depend() mentions other exceptions, but only this one currently has any effect as far as I can see:

"pg_database: it's a feature, not a bug, that template1 is not pinned."

-John Naylor

#15Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#13)
Re: automatically assigning catalog toast oids

Hi,

On 2018-12-13 20:21:12 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

[ separation of FirstBootstrapObjectId and FirstGenbkiObjectId ]

It seems like we should also add a check to genbki.pl that the ending
value of GenbkiNextOid is <= FirstBootstrapObjectId, so that we'll
certainly notice when it's necessary to adjust those boundaries.

Hm, if we go there, it'd probably also good to check for <=
FirstGenbkiObjectId?

I did *not* change record_plan_function_dependency(), it seems correct
that it doesn't track genbki assigned oids, they certainly can't change
while a server is running. But I'm not entirely clear to why that's not
using FirstNormalObjectId as the cutoff, so perhaps I'm missing
something. Similar with logic in predicate.c.

It's not about whether they can change whether the server is running,
it's about whether they're pinned. OIDs above 10000 might or might
not be pinned; we shouldn't hard-wire assumptions about that. So
I suspect you made the wrong choice there.

Hm, but aren't pins setup by initdb's setup_depend()? IOW, genbki
assigned oids will be in there? Am I missing something?

BTW, this ties into something that was bugging me this afternoon while
looking at dependencies on the default collation: there's a bunch of
places that special-case DEFAULT_COLLATION_OID to skip adding dependencies
on it, for instance this in dependency.c:

/*
* We must also depend on the constant's collation: it could be
* different from the datatype's, if a CollateExpr was const-folded to
* a simple constant. However we can save work in the most common
* case where the collation is "default", since we know that's pinned.
*/
if (OidIsValid(con->constcollid) &&
con->constcollid != DEFAULT_COLLATION_OID)
add_object_address(OCLASS_COLLATION, con->constcollid, 0,
context->addrs);

I'm pretty sure that that special case is my fault, added in perhaps
over-eagerness to minimize the cost added by the collations feature.
Looking at it now, it seems like mostly a wart. But perhaps there is
a more general principle here: why don't we just hard-wire an assumption
that all OIDs below FirstGenbkiObjectId are pinned? I'm thinking of
getting rid of the DEFAULT_COLLATION_OID special cases in dependency-
recording logic and instead having someplace central like isObjectPinned
just assume that such OIDs are pinned, so that we don't bother to
consult or update pg_depend for them.

We could take it a bit further and not make the 'p' entries for such
OIDs in the first place, greatly reducing the size of pg_depend in
typical installations. Perhaps that'd confuse clients that look at
that catalog, though.

I think that'd be good, it's the second largest table in a new cluster,
and it's not going to get smaller. 'p' already is somewhat of a special
case, so I'm not particulary concerned with clients having to understand
that.

Greetings,

Andres Freund

#16Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#13)
Re: automatically assigning catalog toast oids

On 2018-Dec-13, Tom Lane wrote:

We could take it a bit further and not make the 'p' entries for such
OIDs in the first place, greatly reducing the size of pg_depend in
typical installations. Perhaps that'd confuse clients that look at
that catalog, though.

The number of 'p' entries in pg_depend is often annoying when manually
querying it. I support the idea of special-casing such entries.

Looking at the existing entries, it seems like we'd have to have
one special case: schema public has OID 2200 but is intentionally
not pinned. I wouldn't have a hard time with teaching isObjectPinned
about that; though if it turns out that many places need to know
about it, maybe it's not workable.

Why not just move that OID outside the Genbki special range? I have
seen quite a few installs where schema public was removed and later
re-added. I've never seen a query hardcode OID 2200, and I'd call one
which does buggy.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#17Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#16)
Re: automatically assigning catalog toast oids

Hi,

On 2018-12-14 22:54:14 -0300, Alvaro Herrera wrote:

On 2018-Dec-13, Tom Lane wrote:

Looking at the existing entries, it seems like we'd have to have
one special case: schema public has OID 2200 but is intentionally
not pinned. I wouldn't have a hard time with teaching isObjectPinned
about that; though if it turns out that many places need to know
about it, maybe it's not workable.

Why not just move that OID outside the Genbki special range? I have
seen quite a few installs where schema public was removed and later
re-added. I've never seen a query hardcode OID 2200, and I'd call one
which does buggy.

+1

Greetings,

Andres Freund

#18John Naylor
jcnaylor@gmail.com
In reply to: Andres Freund (#15)
1 attachment(s)
Re: automatically assigning catalog toast oids

On 12/15/18, Andres Freund <andres@anarazel.de> wrote:

[ separation of FirstBootstrapObjectId and FirstGenbkiObjectId ]

I just noticed a small typo in transam.h. Patch attached.

-John Naylor

Attachments:

fix-comment-about-FirstBootstrapObjectId.patchtext/x-patch; charset=US-ASCII; name=fix-comment-about-FirstBootstrapObjectId.patchDownload
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 4a7eab00d8..444be4aeb5 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -74,7 +74,7 @@
  *		OIDs 1-9999 are reserved for manual assignment (see .dat files in
  *		src/include/catalog/), with 9000-9999 tentatively reserved for forks.
  *
- *		OIDs 10000-12000 are reserved for assignment by genbki.pl, when the
+ *		OIDs 10000-11999 are reserved for assignment by genbki.pl, when the
  *		.dat files in src/include/catalog/ do not specify oids.
  *
  *		OIDS 12000-16383 are reserved for assignment during initdb
#19Michael Paquier
michael@paquier.xyz
In reply to: John Naylor (#18)
Re: automatically assigning catalog toast oids

On Sat, Feb 16, 2019 at 04:50:51PM +0100, John Naylor wrote:

I just noticed a small typo in transam.h. Patch attached.

Thanks, fixed.
--
Michael

#20Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#19)
Re: automatically assigning catalog toast oids

On 2019-02-18 12:45:10 +0900, Michael Paquier wrote:

On Sat, Feb 16, 2019 at 04:50:51PM +0100, John Naylor wrote:

I just noticed a small typo in transam.h. Patch attached.

Thanks, fixed.

Thx.