generating bootstrap entries for array types

Started by John Nayloralmost 8 years ago10 messageshackers
Jump to latest
#1John Naylor
john.naylor@enterprisedb.com

I wrote [1]/messages/by-id/CAJVSVGW-D7OobzU=dybVT2JqZAx-4X1yvBJdavBmqQL05Q6CLw@mail.gmail.com:

On 4/26/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

if I counted correctly. (Array entries should be ignored for this
purpose; maybe we'll autogenerate them someday.)

Hmm, that wouldn't be too hard. Add a new metadata field called
'array_type_oid', then if it finds such an OID, teach genbki.pl to
construct a tuple for the corresponding array type by consulting
something like

char typcategory BKI_ARRAY(A);
char typstorage BKI_ARRAY(x);
...etc

in the header file, plus copying typalign from the element type. I'll
whip this up sometime and add it to the next commitfest.

This turned out to be slightly more complicated than that, but still
pretty straightforward. The script is for information only, it doesn't
need to be run.

-typalign for arrays is 'i' unless the element type is 'd', in which
case it's 'd'.
-If future array-like types are created that break the model and so
can't be generated from the element type, they have an escape hatch of
having their own full entry. Currently this includes _record, since
it's a pseudotype, and of course the special types oidvector and
int2vector.
-I've kept the current convention of duplicating typdelim in the array
type, although I'm not sure it's used anywhere.

If you sort the before and after postgres.bki, it should result in a clean diff.

Will add to next commitfest.

--
[1]: /messages/by-id/CAJVSVGW-D7OobzU=dybVT2JqZAx-4X1yvBJdavBmqQL05Q6CLw@mail.gmail.com

-John Naylor

Attachments:

v1-0001-Generate-bootstrap-entries-for-array-types.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Generate-bootstrap-entries-for-array-types.patchDownload+153-407
v1-remove_array_types.plapplication/x-perl; name=v1-remove_array_types.plDownload
In reply to: John Naylor (#1)
Re: generating bootstrap entries for array types

Hi John,

John Naylor <jcnaylor@gmail.com> writes:

On 4/26/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

if I counted correctly. (Array entries should be ignored for this
purpose; maybe we'll autogenerate them someday.)

Hmm, that wouldn't be too hard. Add a new metadata field called
'array_type_oid', then if it finds such an OID, teach genbki.pl to
construct a tuple for the corresponding array type by consulting
something like

char typcategory BKI_ARRAY(A);
char typstorage BKI_ARRAY(x);
...etc

in the header file, plus copying typalign from the element type. I'll
whip this up sometime and add it to the next commitfest.

This turned out to be slightly more complicated than that, but still
pretty straightforward.

Doing this in genbki.pl makes DBD::Pg lose its array type information,
since it uses Catalog::ParseData() to get it
(https://github.com/bucardo/dbdpg/blob/master/types.c#L439). May I
suggest moving gen_array_types() to Catalog.pm and calling it from
ParseData() if the input file is pg_type.dat, so that it always returns
complete data?

Thanks,

- ilmari
--
"A disappointingly low fraction of the human race is,
at any given time, on fire." - Stig Sandbeck Mathisen

In reply to: Dagfinn Ilmari Mannsåker (#2)
Re: generating bootstrap entries for array types

Hi again John,

ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes:

Doing this in genbki.pl makes DBD::Pg lose its array type information,
since it uses Catalog::ParseData() to get it
(https://github.com/bucardo/dbdpg/blob/master/types.c#L439). May I
suggest moving gen_array_types() to Catalog.pm and calling it from
ParseData() if the input file is pg_type.dat, so that it always returns
complete data?

Attached is a proposed revision of this patch that does the above. It
passes check-world, reformat_dat_file.pl still works, and DBD::Pg is
still able to get the array type information.

Thanks,

- ilmari
--
"I use RMS as a guide in the same way that a boat captain would use
a lighthouse. It's good to know where it is, but you generally
don't want to find yourself in the same spot." - Tollef Fog Heen

Attachments:

v2-0001-Generate-bootstrap-entries-for-array-types.patchtext/x-diffDownload+154-406
In reply to: Dagfinn Ilmari Mannsåker (#3)
Re: generating bootstrap entries for array types

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: tested, passed

The v2 patch is IMO ready to commit.

The documentation in the form of source comments is updated,
and having Catalog::ParseData() do the expansion instead of
genbki.pl is more in line with what bki.sgml says:

If you want to add a new method of making the data representation
smaller, you must implement it
in <filename>reformat_dat_file.pl</filename> and also
teach <function>Catalog::ParseData()</function> how to expand the
data back into the full representation.

- ilmari

The new status of this patch is: Ready for Committer

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dagfinn Ilmari Mannsåker (#3)
Re: generating bootstrap entries for array types

ilmari@ilmari.org (Dagfinn Ilmari =?utf-8?Q?Manns=C3=A5ker?=) writes:

ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes:

Doing this in genbki.pl makes DBD::Pg lose its array type information,
since it uses Catalog::ParseData() to get it
(https://github.com/bucardo/dbdpg/blob/master/types.c#L439). May I
suggest moving gen_array_types() to Catalog.pm and calling it from
ParseData() if the input file is pg_type.dat, so that it always returns
complete data?

Attached is a proposed revision of this patch that does the above. It
passes check-world, reformat_dat_file.pl still works, and DBD::Pg is
still able to get the array type information.

I find the way you did that (making the call dependent on
$preserve_formatting) to be a mighty ugly kluge. It causes ParseData
to know far more than it should about what callers are likely to do with
the data.

I'm inclined to think the right way is to do the expansion always,
and teach reformat_dat_file.pl to drop autogenerated array types
on the way out, making this work more like the existing facilities
for default/computed fields.

The easiest way to make that happen seems to be to invent another, purely
internal metadata field, along the lines of "is_autogenerated_entry".
Fooling with that now ...

regards, tom lane

In reply to: Tom Lane (#5)
Re: generating bootstrap entries for array types

Tom Lane <tgl@sss.pgh.pa.us> writes:

ilmari@ilmari.org (Dagfinn Ilmari =?utf-8?Q?Manns=C3=A5ker?=) writes:

ilmari@ilmari.org (Dagfinn Ilmari Mannsåker) writes:

Doing this in genbki.pl makes DBD::Pg lose its array type information,
since it uses Catalog::ParseData() to get it
(https://github.com/bucardo/dbdpg/blob/master/types.c#L439). May I
suggest moving gen_array_types() to Catalog.pm and calling it from
ParseData() if the input file is pg_type.dat, so that it always returns
complete data?

Attached is a proposed revision of this patch that does the above. It
passes check-world, reformat_dat_file.pl still works, and DBD::Pg is
still able to get the array type information.

I find the way you did that (making the call dependent on
$preserve_formatting) to be a mighty ugly kluge. It causes ParseData
to know far more than it should about what callers are likely to do with
the data.

I'm inclined to think the right way is to do the expansion always,
and teach reformat_dat_file.pl to drop autogenerated array types
on the way out, making this work more like the existing facilities
for default/computed fields.

The easiest way to make that happen seems to be to invent another, purely
internal metadata field, along the lines of "is_autogenerated_entry".
Fooling with that now ...

Something like this, on top of the v2 patch?

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index 12b142928a..7286c5b673 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -297,7 +297,7 @@ sub ParseData
 	# Generate array types unless we're just reformatting the file
 	Catalog::GenArrayTypes($schema, $data)
-		if $catname eq 'pg_type' and !$preserve_formatting;
+		if $catname eq 'pg_type';
 	return $data;
 }

@@ -485,6 +485,7 @@ sub GenArrayTypes

foreach my $elem_type (@$types)
{
+ next if ref $elem_type ne 'HASH';
next if !exists $elem_type->{array_type_oid};
my %array_type;

@@ -493,6 +494,9 @@ sub GenArrayTypes
$array_type{typname} = '_' . $elem_type->{typname};
$array_type{typelem} = $elem_type->{typname};

+		# Stops this entry being output when reformatting the .dat files
+		$array_type{is_auto_generated_entry} = 1;
+
 		# Arrays require INT alignment, unless the element type requires
 		# DOUBLE alignment.
 		$array_type{typalign} = $elem_type->{typalign} eq 'd' ? 'd' : 'i';
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 053d1ee00d..3859d46043 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -396,7 +396,8 @@ EOM
 			  || $key eq "oid_symbol"
 			  || $key eq "descr"
 			  || $key eq "line_number"
-			  || $key eq "array_type_oid";
+			  || $key eq "array_type_oid"
+			  || $key eq "is_auto_generated_entry";
 			die sprintf "unrecognized field name \"%s\" in %s.dat line %s\n",
 			  $key, $catname, $bki_values{line_number}
 			  if (!exists($attnames{$key}));
diff --git a/src/include/catalog/reformat_dat_file.pl b/src/include/catalog/reformat_dat_file.pl
index 24d9d18637..78c0f3e103 100755
--- a/src/include/catalog/reformat_dat_file.pl
+++ b/src/include/catalog/reformat_dat_file.pl
@@ -132,6 +132,7 @@ foreach my $catname (@catnames)
 		if (ref $data eq 'HASH')
 		{
 			my %values = %$data;
+			next if $values{is_auto_generated_entry};

############################################################
# At this point we have the full tuple in memory as a hash

- ilmari
--
"The surreality of the universe tends towards a maximum" -- Skud's Law
"Never formulate a law or axiom that you're not prepared to live with
the consequences of." -- Skud's Meta-Law

#7John Naylor
john.naylor@enterprisedb.com
In reply to: Dagfinn Ilmari Mannsåker (#3)
Re: generating bootstrap entries for array types

On 9/20/18, Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> wrote:

Hi again John,

Hi Ilmari,
My apologies -- your messages ended up in my spam folder. I only saw
this thread again when Tom replied. Thanks for the review! I'll keep a
look out for any future messages.

-John Naylor

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dagfinn Ilmari Mannsåker (#6)
Re: generating bootstrap entries for array types

ilmari@ilmari.org (Dagfinn Ilmari =?utf-8?Q?Manns=C3=A5ker?=) writes:

Tom Lane <tgl@sss.pgh.pa.us> writes:

I'm inclined to think the right way is to do the expansion always,
and teach reformat_dat_file.pl to drop autogenerated array types
on the way out, making this work more like the existing facilities
for default/computed fields.

The easiest way to make that happen seems to be to invent another, purely
internal metadata field, along the lines of "is_autogenerated_entry".
Fooling with that now ...

Something like this, on top of the v2 patch?

Yeah, that's pretty close to what I came up with, except that I thought
it'd be good if "reformat_dat_file.pl --full-tuples" would print
autogenerated entries; seems useful for debug purposes if nothing else.
So that requires also teaching ParseData to ignore autogenerated entries
on read-in, else you end up with duplicates.

I did some other minor hacking (mostly, fixing the documentation)
and pushed it.

regards, tom lane

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#8)
Re: generating bootstrap entries for array types

I wrote:

I did some other minor hacking (mostly, fixing the documentation)
and pushed it.

BTW, I hadn't thought to measure before, but this patch cut the
size of pg_type.dat by almost 40%. Nice!

regards, tom lane

#10John Naylor
john.naylor@enterprisedb.com
In reply to: Tom Lane (#8)
Re: generating bootstrap entries for array types

On 9/21/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I did some other minor hacking (mostly, fixing the documentation)
and pushed it.

Thank you.

-John Naylor