Set all variable-length fields of pg_attribute to null on column drop

Started by Peter Eisentrautabout 2 years ago4 messages
#1Peter Eisentraut
peter@eisentraut.org
1 attachment(s)

I noticed that when a column is dropped, RemoveAttributeById() clears
out certain fields in pg_attribute, but it leaves the variable-length
fields at the end (attacl, attoptions, and attfdwoptions) unchanged.
This is probably harmless, but it seems wasteful and unclean, and leaves
potentially dangling data lying around (for example, attacl could
contain references to users that are later also dropped).

I suggest the attached patch to set those fields to null when a column
is marked as dropped.

Attachments:

v1-0001-Set-all-variable-length-fields-of-pg_attribute-to.patchtext/plain; charset=UTF-8; name=v1-0001-Set-all-variable-length-fields-of-pg_attribute-to.patchDownload
From d587588df6d2bea8ea9b9e13b897e4206b1a0884 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 30 Nov 2023 12:19:20 +0100
Subject: [PATCH v1] Set all variable-length fields of pg_attribute to null on
 column drop

When a column is dropped, the fields attacl, attoptions, and
attfdwoptions were kept unchanged.  This is probably harmless, but it
seems wasteful, and leaves potentially dangling data lying around (for
example, attacl could contain references to users that are later also
dropped).

Change this to set those fields to null when a column is marked as
dropped.
---
 src/backend/catalog/heap.c | 39 ++++++++++++++++++++------------------
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 7224d96695..b93894889d 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -1647,6 +1647,9 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
 	HeapTuple	tuple;
 	Form_pg_attribute attStruct;
 	char		newattname[NAMEDATALEN];
+	Datum		valuesAtt[Natts_pg_attribute] = {0};
+	bool		nullsAtt[Natts_pg_attribute] = {0};
+	bool		replacesAtt[Natts_pg_attribute] = {0};
 
 	/*
 	 * Grab an exclusive lock on the target table, which we will NOT release
@@ -1695,24 +1698,24 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
 			 "........pg.dropped.%d........", attnum);
 	namestrcpy(&(attStruct->attname), newattname);
 
-	/* clear the missing value if any */
-	if (attStruct->atthasmissing)
-	{
-		Datum		valuesAtt[Natts_pg_attribute] = {0};
-		bool		nullsAtt[Natts_pg_attribute] = {0};
-		bool		replacesAtt[Natts_pg_attribute] = {0};
-
-		/* update the tuple - set atthasmissing and attmissingval */
-		valuesAtt[Anum_pg_attribute_atthasmissing - 1] =
-			BoolGetDatum(false);
-		replacesAtt[Anum_pg_attribute_atthasmissing - 1] = true;
-		valuesAtt[Anum_pg_attribute_attmissingval - 1] = (Datum) 0;
-		nullsAtt[Anum_pg_attribute_attmissingval - 1] = true;
-		replacesAtt[Anum_pg_attribute_attmissingval - 1] = true;
-
-		tuple = heap_modify_tuple(tuple, RelationGetDescr(attr_rel),
-								  valuesAtt, nullsAtt, replacesAtt);
-	}
+	/* Clear the missing value */
+	attStruct->atthasmissing = false;
+	nullsAtt[Anum_pg_attribute_attmissingval - 1] = true;
+	replacesAtt[Anum_pg_attribute_attmissingval - 1] = true;
+
+	/*
+	 * Clear the other variable-length fields.  This saves some space in
+	 * pg_attribute and removes no longer useful information.
+	 */
+	nullsAtt[Anum_pg_attribute_attacl - 1] = true;
+	replacesAtt[Anum_pg_attribute_attacl - 1] = true;
+	nullsAtt[Anum_pg_attribute_attoptions - 1] = true;
+	replacesAtt[Anum_pg_attribute_attoptions - 1] = true;
+	nullsAtt[Anum_pg_attribute_attfdwoptions - 1] = true;
+	replacesAtt[Anum_pg_attribute_attfdwoptions - 1] = true;
+
+	tuple = heap_modify_tuple(tuple, RelationGetDescr(attr_rel),
+							  valuesAtt, nullsAtt, replacesAtt);
 
 	CatalogTupleUpdate(attr_rel, &tuple->t_self, tuple);
 
-- 
2.43.0

#2Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#1)
Re: Set all variable-length fields of pg_attribute to null on column drop

On Thu, Nov 30, 2023 at 6:24 AM Peter Eisentraut <peter@eisentraut.org> wrote:

I noticed that when a column is dropped, RemoveAttributeById() clears
out certain fields in pg_attribute, but it leaves the variable-length
fields at the end (attacl, attoptions, and attfdwoptions) unchanged.
This is probably harmless, but it seems wasteful and unclean, and leaves
potentially dangling data lying around (for example, attacl could
contain references to users that are later also dropped).

I suggest the attached patch to set those fields to null when a column
is marked as dropped.

I haven't reviewed the patch, but +1 for the idea.

--
Robert Haas
EDB: http://www.enterprisedb.com

#3Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Peter Eisentraut (#1)
Re: Set all variable-length fields of pg_attribute to null on column drop

On 2023-Nov-30, Peter Eisentraut wrote:

I noticed that when a column is dropped, RemoveAttributeById() clears out
certain fields in pg_attribute, but it leaves the variable-length fields at
the end (attacl, attoptions, and attfdwoptions) unchanged. This is probably
harmless, but it seems wasteful and unclean, and leaves potentially dangling
data lying around (for example, attacl could contain references to users
that are later also dropped).

Yeah, this looks like an ancient oversight -- when DROP COLUMN was added
we didn't have any varlena fields in this catalog, and when the first
one was added (attacl in commit 3cb5d6580a33) resetting it on DROP
COLUMN was overlooked.

LGTM.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"No hay ausente sin culpa ni presente sin disculpa" (Prov. francés)

#4Peter Eisentraut
peter@eisentraut.org
In reply to: Alvaro Herrera (#3)
Re: Set all variable-length fields of pg_attribute to null on column drop

On 22.12.23 10:05, Alvaro Herrera wrote:

On 2023-Nov-30, Peter Eisentraut wrote:

I noticed that when a column is dropped, RemoveAttributeById() clears out
certain fields in pg_attribute, but it leaves the variable-length fields at
the end (attacl, attoptions, and attfdwoptions) unchanged. This is probably
harmless, but it seems wasteful and unclean, and leaves potentially dangling
data lying around (for example, attacl could contain references to users
that are later also dropped).

Yeah, this looks like an ancient oversight -- when DROP COLUMN was added
we didn't have any varlena fields in this catalog, and when the first
one was added (attacl in commit 3cb5d6580a33) resetting it on DROP
COLUMN was overlooked.

LGTM.

committed