tablecmds.c/MergeAttributes() cleanup
The MergeAttributes() and related code in and around tablecmds.c is huge
and ancient, with many things bolted on over time, and difficult to deal
with. I took some time to make careful incremental updates and
refactorings to make the code easier to follow, more compact, and more
modern in appearance. I also found several pieces of obsolete code
along the way. This resulted in the attached long patch series. Each
patch tries to make a single change and can be considered incrementally.
At the end, the code is shorter, better factored, and I hope easier to
understand. There shouldn't be any change in behavior.
Attachments:
0001-Remove-obsolete-comment-about-OID-support.patchtext/plain; charset=UTF-8; name=0001-Remove-obsolete-comment-about-OID-support.patchDownload+3-4
0002-Remove-ancient-special-case-code-for-adding-oid-colu.patchtext/plain; charset=UTF-8; name=0002-Remove-ancient-special-case-code-for-adding-oid-colu.patchDownload+3-5
0003-Remove-ancient-special-case-code-for-dropping-oid-co.patchtext/plain; charset=UTF-8; name=0003-Remove-ancient-special-case-code-for-dropping-oid-co.patchDownload+43-57
0004-Make-more-use-of-makeColumnDef.patchtext/plain; charset=UTF-8; name=0004-Make-more-use-of-makeColumnDef.patchDownload+16-65
0005-Clean-up-MergeAttributesIntoExisting.patchtext/plain; charset=UTF-8; name=0005-Clean-up-MergeAttributesIntoExisting.patchDownload+42-74
0006-Clean-up-MergeCheckConstraint.patchtext/plain; charset=UTF-8; name=0006-Clean-up-MergeCheckConstraint.patchDownload+21-27
0007-MergeAttributes-and-related-variable-renaming.patchtext/plain; charset=UTF-8; name=0007-MergeAttributes-and-related-variable-renaming.patchDownload+59-65
0008-Improve-some-catalog-documentation.patchtext/plain; charset=UTF-8; name=0008-Improve-some-catalog-documentation.patchDownload+8-6
0009-Remove-useless-if-condition.patchtext/plain; charset=UTF-8; name=0009-Remove-useless-if-condition.patchDownload+2-6
0010-Remove-useless-if-condition.patchtext/plain; charset=UTF-8; name=0010-Remove-useless-if-condition.patchDownload+1-6
0011-Refactor-ATExecAddColumn-to-use-BuildDescForRelation.patchtext/plain; charset=UTF-8; name=0011-Refactor-ATExecAddColumn-to-use-BuildDescForRelation.patchDownload+31-67
0012-Push-attidentity-and-attgenerated-handling-into-Buil.patchtext/plain; charset=UTF-8; name=0012-Push-attidentity-and-attgenerated-handling-into-Buil.patchDownload+2-5
0013-Move-BuildDescForRelation-from-tupdesc.c-to-tablecmd.patchtext/plain; charset=UTF-8; name=0013-Move-BuildDescForRelation-from-tupdesc.c-to-tablecmd.patchDownload+105-111
0014-Push-attcompression-and-attstorage-handling-into-Bui.patchtext/plain; charset=UTF-8; name=0014-Push-attcompression-and-attstorage-handling-into-Bui.patchDownload+5-15
0015-Add-TupleDescGetDefault.patchtext/plain; charset=UTF-8; name=0015-Add-TupleDescGetDefault.patchDownload+32-42
0016-MergeAttributes-convert-pg_attribute-back-to-ColumnD.patchtext/plain; charset=UTF-8; name=0016-MergeAttributes-convert-pg_attribute-back-to-ColumnD.patchDownload+93-87
0017-Add-some-const-decorations.patchtext/plain; charset=UTF-8; name=0017-Add-some-const-decorations.patchDownload+2-3
On 2023-Jun-28, Peter Eisentraut wrote:
The MergeAttributes() and related code in and around tablecmds.c is huge and
ancient, with many things bolted on over time, and difficult to deal with.
I took some time to make careful incremental updates and refactorings to
make the code easier to follow, more compact, and more modern in appearance.
I also found several pieces of obsolete code along the way. This resulted
in the attached long patch series. Each patch tries to make a single change
and can be considered incrementally. At the end, the code is shorter,
better factored, and I hope easier to understand. There shouldn't be any
change in behavior.
I request to leave this alone for now. I have enough things to juggle
with in the NOTNULLs patch; this patchset looks like it will cause me
messy merge conflicts. 0004 for instance looks problematic, as does
0007 and 0016.
FWIW for the most part that patch is working and I intend to re-submit
shortly, but the relevant pg_upgrade code is really brittle, so it's
taken me much more than I expected to get it in good shape for all
cases.
Thanks
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
On 2023-Jun-28, Peter Eisentraut wrote:
The MergeAttributes() and related code in and around tablecmds.c is huge and
ancient, with many things bolted on over time, and difficult to deal with.
I took some time to make careful incremental updates and refactorings to
make the code easier to follow, more compact, and more modern in appearance.
I also found several pieces of obsolete code along the way. This resulted
in the attached long patch series. Each patch tries to make a single change
and can be considered incrementally. At the end, the code is shorter,
better factored, and I hope easier to understand. There shouldn't be any
change in behavior.
I spent a few minutes doing a test merge of this to my branch with NOT
NULL changes. Here's a quick review.
Subject: [PATCH 01/17] Remove obsolete comment about OID support
Obvious, trivial. +1
Subject: [PATCH 02/17] Remove ancient special case code for adding oid columns
LGTM; deletes dead code.
Subject: [PATCH 03/17] Remove ancient special case code for dropping oid
columns
Hmm, interesting. Yay for more dead code removal. Didn't verify it.
Subject: [PATCH 04/17] Make more use of makeColumnDef()
Good idea, +1. Some lines (almost all makeColumnDef callsites) end up
too long. This is the first patch that actually conflicts with the NOT
NULLs one, and the conflicts are easy to solve, so I won't complain if
you want to get it pushed soon.
Subject: [PATCH 05/17] Clean up MergeAttributesIntoExisting()
I don't disagree with this in principle, but this one has more
conflicts than the previous ones.
Subject: [PATCH 06/17] Clean up MergeCheckConstraint()
Looks a reasonable change as far as this patch goes.
However, reading it I noticed that CookedConstraint->inhcount is int
and is tested for wraparound, but pg_constraint.coninhcount is int16.
This is probably bogus already. ColumnDef->inhcount is also int. These
should be narrowed to match the catalog definitions.
Subject: [PATCH 07/17] MergeAttributes() and related variable renaming
I think this makes sense, but there's a bunch of conflicts to NOT NULLs.
I guess we can come back to this one later.
Subject: [PATCH 08/17] Improve some catalog documentation
Point out that typstorage and attstorage are never '\0', even for
fixed-length types. This is different from attcompression. For this
reason, some of the handling of these columns in tablecmds.c etc. is
different. (catalogs.sgml already contained this information in an
indirect way.)
I don't understand why we must point out that they're never '\0'. I
mean, if we're doing that, why not say that they can never be \xFF?
The valid values are already listed. The other parts of this patch look
okay.
Subject: [PATCH 09/17] Remove useless if condition
This is useless because these fields are not set anywhere before, so
we can assign them unconditionally. This also makes this more
consistent with ATExecAddColumn().
Makes sense.
Subject: [PATCH 10/17] Remove useless if condition
We can call GetAttributeCompression() with a NULL argument. It
handles that internally already. This change makes all the callers of
GetAttributeCompression() uniform.
I agree, +1.
From 2eda6bc9897d0995a5112e2851c51daf0c35656e Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 14 Jun 2023 17:51:31 +0200
Subject: [PATCH 11/17] Refactor ATExecAddColumn() to use
BuildDescForRelation()BuildDescForRelation() has all the knowledge for converting a
ColumnDef into pg_attribute/tuple descriptor. ATExecAddColumn() can
make use of that, instead of duplicating all that logic. We just pass
a one-element list of ColumnDef and we get back exactly the data
structure we need. Note that we don't even need to touch
BuildDescForRelation() to make this work.
Hmm, crazy. I'm not sure I like this, because it seems much too clever.
The number of lines that are deleted is alluring, though.
Maybe it'd be better to create a separate routine that takes a single
ColumnDef and returns the Form_pg_attribute element for it; then use
that in both BuildDescForRelation and ATExecAddColumn.
Subject: [PATCH 12/17] Push attidentity and attgenerated handling into
BuildDescForRelation()Previously, this was handled by the callers separately, but it can be
trivially moved into BuildDescForRelation() so that it is handled in a
central place.
Looks reasonable.
I think the last few patches are the more innovative (interesting,
useful) of the bunch. Let's get the first few out of the way.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
On 11.07.23 20:17, Alvaro Herrera wrote:
I spent a few minutes doing a test merge of this to my branch with NOT
NULL changes. Here's a quick review.Subject: [PATCH 01/17] Remove obsolete comment about OID support
Obvious, trivial. +1
Subject: [PATCH 02/17] Remove ancient special case code for adding oid columns
LGTM; deletes dead code.
Subject: [PATCH 03/17] Remove ancient special case code for dropping oid
columnsHmm, interesting. Yay for more dead code removal. Didn't verify it.
I have committed these first three. I'll leave it at that for now.
Subject: [PATCH 08/17] Improve some catalog documentation
Point out that typstorage and attstorage are never '\0', even for
fixed-length types. This is different from attcompression. For this
reason, some of the handling of these columns in tablecmds.c etc. is
different. (catalogs.sgml already contained this information in an
indirect way.)I don't understand why we must point out that they're never '\0'. I
mean, if we're doing that, why not say that they can never be \xFF?
The valid values are already listed. The other parts of this patch look
okay.
While working through the storage and compression handling, which look
similar, I was momentarily puzzled by this. While attcompression can be
0 to mean, use default, this is not possible/allowed for attstorage, but
it took looking around three corners to verify this. It could be more
explicit, I thought.
On 12.07.23 16:29, Peter Eisentraut wrote:
On 11.07.23 20:17, Alvaro Herrera wrote:
I spent a few minutes doing a test merge of this to my branch with NOT
NULL changes. Here's a quick review.Subject: [PATCH 01/17] Remove obsolete comment about OID support
Obvious, trivial. +1
Subject: [PATCH 02/17] Remove ancient special case code for adding
oid columnsLGTM; deletes dead code.
Subject: [PATCH 03/17] Remove ancient special case code for dropping oid
columnsHmm, interesting. Yay for more dead code removal. Didn't verify it.
I have committed these first three. I'll leave it at that for now.
I have committed a few more patches from this series that were already
agreed upon. The remaining ones are rebased and reordered a bit, attached.
There was some doubt about the patch "Refactor ATExecAddColumn() to use
BuildDescForRelation()" (now 0009), whether it's too clever to build a
fake one-item tuple descriptor. I am working on another patch, which I
hope to post this week, that proposes to replace the use of tuple
descriptors there with a List of something. That would then allow maybe
rewriting this in a less-clever way. That patch incidentally also wants
to move BuildDescForRelation from tupdesc.c to tablecmds.c (patch 0007
here).
Attachments:
v2-0010-MergeAttributes-convert-pg_attribute-back-to-Colu.patchtext/plain; charset=UTF-8; name=v2-0010-MergeAttributes-convert-pg_attribute-back-to-Colu.patchDownload+94-88
v2-0001-Clean-up-MergeAttributesIntoExisting.patchtext/plain; charset=UTF-8; name=v2-0001-Clean-up-MergeAttributesIntoExisting.patchDownload+48-76
v2-0002-Clean-up-MergeCheckConstraint.patchtext/plain; charset=UTF-8; name=v2-0002-Clean-up-MergeCheckConstraint.patchDownload+21-27
v2-0003-MergeAttributes-and-related-variable-renaming.patchtext/plain; charset=UTF-8; name=v2-0003-MergeAttributes-and-related-variable-renaming.patchDownload+59-65
v2-0004-Add-TupleDescGetDefault.patchtext/plain; charset=UTF-8; name=v2-0004-Add-TupleDescGetDefault.patchDownload+32-42
v2-0005-Improve-some-catalog-documentation.patchtext/plain; charset=UTF-8; name=v2-0005-Improve-some-catalog-documentation.patchDownload+8-6
v2-0006-Push-attidentity-and-attgenerated-handling-into-B.patchtext/plain; charset=UTF-8; name=v2-0006-Push-attidentity-and-attgenerated-handling-into-B.patchDownload+2-3
v2-0007-Move-BuildDescForRelation-from-tupdesc.c-to-table.patchtext/plain; charset=UTF-8; name=v2-0007-Move-BuildDescForRelation-from-tupdesc.c-to-table.patchDownload+105-111
v2-0008-Push-attcompression-and-attstorage-handling-into-.patchtext/plain; charset=UTF-8; name=v2-0008-Push-attcompression-and-attstorage-handling-into-.patchDownload+5-7
v2-0009-Refactor-ATExecAddColumn-to-use-BuildDescForRelat.patchtext/plain; charset=UTF-8; name=v2-0009-Refactor-ATExecAddColumn-to-use-BuildDescForRelat.patchDownload+22-68
On 2023-Aug-29, Peter Eisentraut wrote:
Regarding this hunk in 0002,
@@ -3278,13 +3261,16 @@ MergeAttributes(List *schema, List *supers, char relpersistence, * * constraints is a list of CookedConstraint structs for previous constraints. * - * Returns true if merged (constraint is a duplicate), or false if it's - * got a so-far-unique name, or throws error if conflict. + * If the constraint is a duplicate, then the existing constraint's + * inheritance count is updated. If the constraint doesn't match or conflict + * with an existing one, a new constraint is appended to the list. If there + * is a conflict (same name but different expression), throw an error.
This wording confused me:
"If the constraint doesn't match or conflict with an existing one, a new
constraint is appended to the list."
I first read it as "doesn't match or conflicts with ..." (i.e., the
negation only applied to the first verb, not both) which would have been
surprising (== broken) behavior.
I think it's clearer if you say "doesn't match nor conflict", but I'm
not sure if this is grammatically correct.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
On 2023-Aug-29, Peter Eisentraut wrote:
From 471fda80c41fae835ecbe63ae8505526a37487a9 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 12 Jul 2023 16:12:35 +0200
Subject: [PATCH v2 04/10] Add TupleDescGetDefault()This unifies some repetitive code.
Note: I didn't push the "not found" error message into the new
function, even though all existing callers would be able to make use
of it. Using the existing error handling as-is would probably require
exposing the Relation type via tupdesc.h, which doesn't seem
desirable.
Note that all errors here are elog(ERROR), not user-facing, so ISTM
it would be OK to change them to report the relation OID rather than the
name.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"In fact, the basic problem with Perl 5's subroutines is that they're not
crufty enough, so the cruft leaks out into user-defined code instead, by
the Conservation of Cruft Principle." (Larry Wall, Apocalypse 6)
On Tue, Aug 29, 2023 at 10:43:39AM +0200, Peter Eisentraut wrote:
I have committed a few more patches from this series that were already
agreed upon. The remaining ones are rebased and reordered a bit, attached.
My compiler is complaining about 1fa9241b:
../postgresql/src/backend/commands/sequence.c: In function ‘DefineSequence’:
../postgresql/src/backend/commands/sequence.c:196:21: error: ‘coldef’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
196 | stmt->tableElts = lappend(stmt->tableElts, coldef);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This went away when I added a default case that ERROR'd or initialized
coldef to NULL.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On 2023-Aug-29, Nathan Bossart wrote:
On Tue, Aug 29, 2023 at 10:43:39AM +0200, Peter Eisentraut wrote:
I have committed a few more patches from this series that were already
agreed upon. The remaining ones are rebased and reordered a bit, attached.My compiler is complaining about 1fa9241b:
../postgresql/src/backend/commands/sequence.c: In function ‘DefineSequence’:
../postgresql/src/backend/commands/sequence.c:196:21: error: ‘coldef’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
196 | stmt->tableElts = lappend(stmt->tableElts, coldef);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~This went away when I added a default case that ERROR'd or initialized
coldef to NULL.
Makes sense. However, maybe we should replace those ugly defines and
their hardcoded values in DefineSequence with a proper array with their
names and datatypes.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Los trabajadores menos efectivos son sistematicamente llevados al lugar
donde pueden hacer el menor daño posible: gerencia." (El principio Dilbert)
On Tue, Aug 29, 2023 at 08:44:02PM +0200, Alvaro Herrera wrote:
On 2023-Aug-29, Nathan Bossart wrote:
My compiler is complaining about 1fa9241b:
../postgresql/src/backend/commands/sequence.c: In function ‘DefineSequence’:
../postgresql/src/backend/commands/sequence.c:196:21: error: ‘coldef’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
196 | stmt->tableElts = lappend(stmt->tableElts, coldef);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~This went away when I added a default case that ERROR'd or initialized
coldef to NULL.Makes sense. However, maybe we should replace those ugly defines and
their hardcoded values in DefineSequence with a proper array with their
names and datatypes.
That might be an improvement, but IIUC you'd still need to enumerate all of
the columns or data types to make sure you use the right get-Datum
function. It doesn't help that last_value uses Int64GetDatumFast and
log_cnt uses Int64GetDatum. I could be missing something, though.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On 29.08.23 19:45, Nathan Bossart wrote:
On Tue, Aug 29, 2023 at 10:43:39AM +0200, Peter Eisentraut wrote:
I have committed a few more patches from this series that were already
agreed upon. The remaining ones are rebased and reordered a bit, attached.My compiler is complaining about 1fa9241b:
../postgresql/src/backend/commands/sequence.c: In function ‘DefineSequence’:
../postgresql/src/backend/commands/sequence.c:196:21: error: ‘coldef’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
196 | stmt->tableElts = lappend(stmt->tableElts, coldef);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~This went away when I added a default case that ERROR'd or initialized
coldef to NULL.
fixed
On 2023-Aug-29, Nathan Bossart wrote:
On Tue, Aug 29, 2023 at 08:44:02PM +0200, Alvaro Herrera wrote:
Makes sense. However, maybe we should replace those ugly defines and
their hardcoded values in DefineSequence with a proper array with their
names and datatypes.That might be an improvement, but IIUC you'd still need to enumerate all of
the columns or data types to make sure you use the right get-Datum
function. It doesn't help that last_value uses Int64GetDatumFast and
log_cnt uses Int64GetDatum. I could be missing something, though.
Well, for sure I meant to enumerate everything that was needed,
including the initializer for the value. Like in the attached patch.
However, now that I've actually written it, I don't find it so pretty
anymore, but maybe that's just because I don't know how to write the
array assignment as a single statement instead of a separate statement
for each column.
But this should silence the warnings, anyway.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Attachments:
seq-warnings.patchtext/x-diff; charset=utf-8Download+33-32
On 29.08.23 13:20, Alvaro Herrera wrote:
On 2023-Aug-29, Peter Eisentraut wrote:
@@ -3278,13 +3261,16 @@ MergeAttributes(List *schema, List *supers, char relpersistence, * * constraints is a list of CookedConstraint structs for previous constraints. * - * Returns true if merged (constraint is a duplicate), or false if it's - * got a so-far-unique name, or throws error if conflict. + * If the constraint is a duplicate, then the existing constraint's + * inheritance count is updated. If the constraint doesn't match or conflict + * with an existing one, a new constraint is appended to the list. If there + * is a conflict (same name but different expression), throw an error.This wording confused me:
"If the constraint doesn't match or conflict with an existing one, a new
constraint is appended to the list."I first read it as "doesn't match or conflicts with ..." (i.e., the
negation only applied to the first verb, not both) which would have been
surprising (== broken) behavior.I think it's clearer if you say "doesn't match nor conflict", but I'm
not sure if this is grammatically correct.
Here is an updated version of this patch set. I resolved some conflicts
and addressed this comment of yours. I also dropped the one patch with
some catalog header edits that people didn't seem to particularly like.
The patches that are now 0001 through 0004 were previously reviewed and
just held for the not-null constraint patches, I think, so I'll commit
them in a few days if there are no objections.
Patches 0005 through 0007 are also ready in my opinion, but they haven't
really been reviewed, so this would be something for reviewers to focus
on. (0005 and 0007 are trivial, but they go to together with 0006.)
The remaining 0008 and 0009 were still under discussion and contemplation.
Attachments:
v3-0001-Clean-up-MergeAttributesIntoExisting.patchtext/plain; charset=UTF-8; name=v3-0001-Clean-up-MergeAttributesIntoExisting.patchDownload+48-76
v3-0002-Clean-up-MergeCheckConstraint.patchtext/plain; charset=UTF-8; name=v3-0002-Clean-up-MergeCheckConstraint.patchDownload+22-27
v3-0003-MergeAttributes-and-related-variable-renaming.patchtext/plain; charset=UTF-8; name=v3-0003-MergeAttributes-and-related-variable-renaming.patchDownload+59-65
v3-0004-Add-TupleDescGetDefault.patchtext/plain; charset=UTF-8; name=v3-0004-Add-TupleDescGetDefault.patchDownload+32-42
v3-0005-Push-attidentity-and-attgenerated-handling-into-B.patchtext/plain; charset=UTF-8; name=v3-0005-Push-attidentity-and-attgenerated-handling-into-B.patchDownload+2-3
v3-0006-Move-BuildDescForRelation-from-tupdesc.c-to-table.patchtext/plain; charset=UTF-8; name=v3-0006-Move-BuildDescForRelation-from-tupdesc.c-to-table.patchDownload+105-111
v3-0007-Push-attcompression-and-attstorage-handling-into-.patchtext/plain; charset=UTF-8; name=v3-0007-Push-attcompression-and-attstorage-handling-into-.patchDownload+5-7
v3-0008-Refactor-ATExecAddColumn-to-use-BuildDescForRelat.patchtext/plain; charset=UTF-8; name=v3-0008-Refactor-ATExecAddColumn-to-use-BuildDescForRelat.patchDownload+22-68
v3-0009-MergeAttributes-convert-pg_attribute-back-to-Colu.patchtext/plain; charset=UTF-8; name=v3-0009-MergeAttributes-convert-pg_attribute-back-to-Colu.patchDownload+94-88
On 19.09.23 15:11, Peter Eisentraut wrote:
Here is an updated version of this patch set. I resolved some conflicts
and addressed this comment of yours. I also dropped the one patch with
some catalog header edits that people didn't seem to particularly like.The patches that are now 0001 through 0004 were previously reviewed and
just held for the not-null constraint patches, I think, so I'll commit
them in a few days if there are no objections.Patches 0005 through 0007 are also ready in my opinion, but they haven't
really been reviewed, so this would be something for reviewers to focus
on. (0005 and 0007 are trivial, but they go to together with 0006.)The remaining 0008 and 0009 were still under discussion and contemplation.
I have committed through 0007, and I'll now close this patch set as
"Committed", and I will (probably) bring back the rest (especially 0008)
as part of a different patch set soon.
On 05.10.23 17:49, Peter Eisentraut wrote:
On 19.09.23 15:11, Peter Eisentraut wrote:
Here is an updated version of this patch set. I resolved some
conflicts and addressed this comment of yours. I also dropped the one
patch with some catalog header edits that people didn't seem to
particularly like.The patches that are now 0001 through 0004 were previously reviewed
and just held for the not-null constraint patches, I think, so I'll
commit them in a few days if there are no objections.Patches 0005 through 0007 are also ready in my opinion, but they
haven't really been reviewed, so this would be something for reviewers
to focus on. (0005 and 0007 are trivial, but they go to together with
0006.)The remaining 0008 and 0009 were still under discussion and
contemplation.I have committed through 0007, and I'll now close this patch set as
"Committed", and I will (probably) bring back the rest (especially 0008)
as part of a different patch set soon.
After playing with this for, well, 2 months, and considering various
other approaches, I would like to bring back the remaining two patches
in unchanged form.
Especially the (now) first patch "Refactor ATExecAddColumn() to use
BuildDescForRelation()" would be very helpful for facilitating further
refactoring in this area, because it avoids having two essentially
duplicate pieces of code responsible for converting a ColumnDef node
into internal form.
One of your (Álvaro's) comments about this earlier was
Hmm, crazy. I'm not sure I like this, because it seems much too clever.
The number of lines that are deleted is alluring, though.Maybe it'd be better to create a separate routine that takes a single
ColumnDef and returns the Form_pg_attribute element for it; then use
that in both BuildDescForRelation and ATExecAddColumn.
which was also my thought at the beginning. However, this wouldn't
quite work that way, for several reasons. For instance,
BuildDescForRelation() also needs to keep track of the has_not_null
property across all fields, and so if you split that function up, you
would have to somehow make that an output argument and have the caller
keep track of it. Also, the output of BuildDescForRelation() in
ATExecAddColumn() is passed into InsertPgAttributeTuples(), which
requires a tuple descriptor anyway, so splitting this up into a
per-attribute function would then require ATExecAddColumn() to
re-assemble a tuple descriptor anyway, so this wouldn't save anything.
Also note that ATExecAddColumn() could in theory be enhanced to add more
than one column in one go, so having this code structure in place isn't
inconsistent with that.
The main hackish thing, I suppose, is that we have to fix up the
attribute number after returning from BuildDescForRelation(). I suppose
we could address that by passing in a starting attribute number (or
alternatively maximum existing attribute number) into
BuildDescForRelation(). I think that would be okay; it would probably
be about a wash in terms of code added versus saved.
The (now) second patch is also still of interest to me, but I have since
noticed that I think [0]/messages/by-id/24656cec-d6ef-4d15-8b5b-e8dfc9c833a7@eisentraut.org should be fixed first, but to make that fix
simpler, I would like the first patch from here.
[0]: /messages/by-id/24656cec-d6ef-4d15-8b5b-e8dfc9c833a7@eisentraut.org
/messages/by-id/24656cec-d6ef-4d15-8b5b-e8dfc9c833a7@eisentraut.org
Attachments:
v4-0001-Refactor-ATExecAddColumn-to-use-BuildDescForRelat.patchtext/plain; charset=UTF-8; name=v4-0001-Refactor-ATExecAddColumn-to-use-BuildDescForRelat.patchDownload+22-68
v4-0002-MergeAttributes-convert-pg_attribute-back-to-Colu.patchtext/plain; charset=UTF-8; name=v4-0002-MergeAttributes-convert-pg_attribute-back-to-Colu.patchDownload+94-88
On 2023-Dec-06, Peter Eisentraut wrote:
One of your (Álvaro's) comments about this earlier was
Hmm, crazy. I'm not sure I like this, because it seems much too clever.
The number of lines that are deleted is alluring, though.Maybe it'd be better to create a separate routine that takes a single
ColumnDef and returns the Form_pg_attribute element for it; then use
that in both BuildDescForRelation and ATExecAddColumn.which was also my thought at the beginning. However, this wouldn't quite
work that way, for several reasons. For instance, BuildDescForRelation()
also needs to keep track of the has_not_null property across all fields, and
so if you split that function up, you would have to somehow make that an
output argument and have the caller keep track of it. Also, the output of
BuildDescForRelation() in ATExecAddColumn() is passed into
InsertPgAttributeTuples(), which requires a tuple descriptor anyway, so
splitting this up into a per-attribute function would then require
ATExecAddColumn() to re-assemble a tuple descriptor anyway, so this wouldn't
save anything.
Hmm. Well, if this state of affairs is useful to you, then I withdraw
my objection, because with this patch we're not really adding any new
weirdness, just moving around already-existing weirdness. So let's
press ahead with 0001. (I didn't look at 0002 this time, since
apparently you'd like to process the other patch first and then come
back here.)
If you look closely at InsertPgAttributeTuples and accompanying code, it
all looks a bit archaic. They seem to be treating TupleDesc as a
glorified array of Form_pg_attribute elements in a convenient packaging.
It's probably cleaner to change these APIs so that they deal with a
Form_pg_attribute array, and not TupleDesc anymore. But we can hack on
that some other day.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Pensar que el espectro que vemos es ilusorio no lo despoja de espanto,
sólo le suma el nuevo terror de la locura" (Perelandra, C.S. Lewis)
On 2024-Jan-11, Alvaro Herrera wrote:
If you look closely at InsertPgAttributeTuples and accompanying code, it
all looks a bit archaic. They seem to be treating TupleDesc as a
glorified array of Form_pg_attribute elements in a convenient packaging.
It's probably cleaner to change these APIs so that they deal with a
Form_pg_attribute array, and not TupleDesc anymore. But we can hack on
that some other day.
In addition, it also occurs to me now that maybe it would make sense to
change the TupleDesc implementation to use a List of Form_pg_attribute
instead of an array, and do away with ->natts. This would let us change
all "for ( ... natts ...)" loops into foreach_ptr() loops ... there are
only five hundred of them or so --rolls eyes--.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"El sudor es la mejor cura para un pensamiento enfermo" (Bardia)
On Fri, Jan 12, 2024 at 5:32 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2024-Jan-11, Alvaro Herrera wrote:
If you look closely at InsertPgAttributeTuples and accompanying code, it
all looks a bit archaic. They seem to be treating TupleDesc as a
glorified array of Form_pg_attribute elements in a convenient packaging.
It's probably cleaner to change these APIs so that they deal with a
Form_pg_attribute array, and not TupleDesc anymore. But we can hack on
that some other day.In addition, it also occurs to me now that maybe it would make sense to
change the TupleDesc implementation to use a List of Form_pg_attribute
instead of an array, and do away with ->natts. This would let us change
all "for ( ... natts ...)" loops into foreach_ptr() loops ... there are
only five hundred of them or so --rolls eyes--.
Open-coding stuff like this can easily work out to a loss, and I
personally think we're overly dependent on List. It's not a
particularly good abstraction, IMHO, and if we do a lot of work to
start using it everywhere because a list is really an array, then what
happens when somebody decides that a list really ought to be a
skip-list, or a hash table, or some other crazy thing?
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
On Fri, Jan 12, 2024 at 5:32 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
In addition, it also occurs to me now that maybe it would make sense to
change the TupleDesc implementation to use a List of Form_pg_attribute
instead of an array, and do away with ->natts. This would let us change
all "for ( ... natts ...)" loops into foreach_ptr() loops ... there are
only five hundred of them or so --rolls eyes--.
Open-coding stuff like this can easily work out to a loss, and I
personally think we're overly dependent on List. It's not a
particularly good abstraction, IMHO, and if we do a lot of work to
start using it everywhere because a list is really an array, then what
happens when somebody decides that a list really ought to be a
skip-list, or a hash table, or some other crazy thing?
Without getting into opinions on whether List is a good abstraction,
I'm -1 on this idea. It would be a large amount of code churn, with
attendant back-patching pain, and I just don't see that there is
much to be gained.
regards, tom lane
On Fri, Jan 12, 2024 at 10:09 AM Robert Haas <robertmhaas@gmail.com> wrote:
Open-coding stuff like this can easily work out to a loss, and I
personally think we're overly dependent on List. It's not a
particularly good abstraction, IMHO, and if we do a lot of work to
start using it everywhere because a list is really an array, then what
happens when somebody decides that a list really ought to be a
skip-list, or a hash table, or some other crazy thing?
This paragraph was a bit garbled. I meant to say that open-coding can
be better than relying on a canned abstraction, but it came out the
other way around.
--
Robert Haas
EDB: http://www.enterprisedb.com