find_inheritance_children() and ALTER TABLE NO INHERIT

Started by Amit Langoteabout 10 years ago4 messages
#1Amit Langote
Langote_Amit_f8@lab.ntt.co.jp

Currently find_inheritance_children() is smart enough to skip a child
table that it finds has been dropped concurrently after it gets a lock on
the same. It does so by looking up the child relid in syscache. It seems
it should also check if the table is still in the list of children of the
parent. Doing so by scanning the pg_inherits(inhparent) index may likely
be inefficient. So, how about adding that syscache on
pg_inherits(inherelid, inhparent) [1]/messages/by-id/25515.1149652014@sss.pgh.pa.us?

I was prompted by a report sometime ago on -general [2]/messages/by-id/55BA1A06.1000100@gmail.com about the "could
not find inherited attribute..." error. Also, it was reported as a bug few
years ago where locking parent exclusively in ALTER TABLE NO INHERIT as a
solution was dismissed for being prone to deadlock issues [3]/messages/by-id/19666.1213709303@sss.pgh.pa.us.

Would it be worth the trouble?

Thanks,
Amit

[1]: /messages/by-id/25515.1149652014@sss.pgh.pa.us
[2]: /messages/by-id/55BA1A06.1000100@gmail.com
[3]: /messages/by-id/19666.1213709303@sss.pgh.pa.us

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Langote (#1)
Re: find_inheritance_children() and ALTER TABLE NO INHERIT

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

Currently find_inheritance_children() is smart enough to skip a child
table that it finds has been dropped concurrently after it gets a lock on
the same. It does so by looking up the child relid in syscache. It seems
it should also check if the table is still in the list of children of the
parent. Doing so by scanning the pg_inherits(inhparent) index may likely
be inefficient. So, how about adding that syscache on
pg_inherits(inherelid, inhparent) [1]?

I doubt that a syscache would fix the performance issue there; it wouldn't
get referenced enough to be likely to have the desired tuple in cache.

I wonder whether we could improve matters by rechecking validity of the
pg_inherits tuple (which we saw already and could presumably retain the
TID of). There is at least one place where we do something like that now,
IIRC.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tom Lane (#2)
Re: find_inheritance_children() and ALTER TABLE NO INHERIT

On 2015/12/03 13:09, Tom Lane wrote:

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

Currently find_inheritance_children() is smart enough to skip a child
table that it finds has been dropped concurrently after it gets a lock on
the same. It does so by looking up the child relid in syscache. It seems
it should also check if the table is still in the list of children of the
parent. Doing so by scanning the pg_inherits(inhparent) index may likely
be inefficient. So, how about adding that syscache on
pg_inherits(inherelid, inhparent) [1]?

I doubt that a syscache would fix the performance issue there; it wouldn't
get referenced enough to be likely to have the desired tuple in cache.

Ah, right.

I wonder whether we could improve matters by rechecking validity of the
pg_inherits tuple (which we saw already and could presumably retain the
TID of). There is at least one place where we do something like that now,
IIRC.

Given that the generation of child OID list and locking of child tables
occur independently, do you mean to collect catalog tuple TIDs along with
corresponding OIDs during the catalog scan and recheck them during the
locking step?

Not sure whether sane but how about performing ordered scan on pg_inherits
(systable_getnext_ordered())and using systable_recheck_tuple() in step
with it? Does using ordered catalog scan ensure safety against deadlocks
that the existing approach of ordered locking of child tables does?
Perhaps I'm missing something.

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#3)
1 attachment(s)
Re: find_inheritance_children() and ALTER TABLE NO INHERIT

On 2015/12/03 15:30, Amit Langote wrote:

On 2015/12/03 13:09, Tom Lane wrote:

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

Currently find_inheritance_children() is smart enough to skip a child
table that it finds has been dropped concurrently after it gets a lock on
the same. It does so by looking up the child relid in syscache. It seems
it should also check if the table is still in the list of children of the
parent.

I wonder whether we could improve matters by rechecking validity of the
pg_inherits tuple (which we saw already and could presumably retain the
TID of). There is at least one place where we do something like that now,
IIRC.

Not sure whether sane but how about performing ordered scan on pg_inherits
(systable_getnext_ordered())and using systable_recheck_tuple() in step
with it? Does using ordered catalog scan ensure safety against deadlocks
that the existing approach of ordered locking of child tables does?
Perhaps I'm missing something.

Just leaving here a patch that does this. It still returns the list in
order determined by qsort(), IOW, not in the pg_inherits_parent_index
order to avoid broken tests. I could not figure how to do it the way you
suggested.

Thanks,
Amit

Attachments:

find-inh-children-surgery-1.patchtext/x-diff; name=find-inh-children-surgery-1.patchDownload
diff --git a/src/backend/catalog/pg_inherits.c b/src/backend/catalog/pg_inherits.c
index 04687c1..bfef40d 100644
--- a/src/backend/catalog/pg_inherits.c
+++ b/src/backend/catalog/pg_inherits.c
@@ -50,9 +50,10 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
 {
 	List	   *list = NIL;
 	Relation	relation;
+	Relation	index;
 	SysScanDesc scan;
 	ScanKeyData key[1];
-	HeapTuple	inheritsTuple;
+	HeapTuple	tup;
 	Oid			inhrelid;
 	Oid		   *oidarr;
 	int			maxoids,
@@ -74,46 +75,30 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
 	numoids = 0;
 
 	relation = heap_open(InheritsRelationId, AccessShareLock);
+	index = index_open(InheritsParentIndexId, AccessShareLock);
 
 	ScanKeyInit(&key[0],
 				Anum_pg_inherits_inhparent,
 				BTEqualStrategyNumber, F_OIDEQ,
 				ObjectIdGetDatum(parentrelId));
 
-	scan = systable_beginscan(relation, InheritsParentIndexId, true,
-							  NULL, 1, key);
+	scan = systable_beginscan_ordered(relation, index, NULL, 1, key);
 
-	while ((inheritsTuple = systable_getnext(scan)) != NULL)
+	while ((tup = systable_getnext_ordered(scan, ForwardScanDirection)) != NULL)
 	{
-		inhrelid = ((Form_pg_inherits) GETSTRUCT(inheritsTuple))->inhrelid;
+		inhrelid = ((Form_pg_inherits) GETSTRUCT(tup))->inhrelid;
 		if (numoids >= maxoids)
 		{
 			maxoids *= 2;
 			oidarr = (Oid *) repalloc(oidarr, maxoids * sizeof(Oid));
 		}
-		oidarr[numoids++] = inhrelid;
-	}
-
-	systable_endscan(scan);
-
-	heap_close(relation, AccessShareLock);
-
-	/*
-	 * If we found more than one child, sort them by OID.  This ensures
-	 * reasonably consistent behavior regardless of the vagaries of an
-	 * indexscan.  This is important since we need to be sure all backends
-	 * lock children in the same order to avoid needless deadlocks.
-	 */
-	if (numoids > 1)
-		qsort(oidarr, numoids, sizeof(Oid), oid_cmp);
-
-	/*
-	 * Acquire locks and build the result list.
-	 */
-	for (i = 0; i < numoids; i++)
-	{
-		inhrelid = oidarr[i];
 
+		/*
+		 * Following code assumes that inhrelids are returned in the same
+		 * order by systable_getnext_ordered() in all backends. This is
+		 * important since we need to be sure all backends lock children
+		 * in the same order to avoid needless deadlocks.
+		 */
 		if (lockmode != NoLock)
 		{
 			/* Get the lock to synchronize against concurrent drop */
@@ -124,7 +109,8 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
 			 * really exists or not.  If not, assume it was dropped while we
 			 * waited to acquire lock, and ignore it.
 			 */
-			if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(inhrelid)))
+			if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(inhrelid)) ||
+				!systable_recheck_tuple(scan, tup))
 			{
 				/* Release useless lock */
 				UnlockRelationOid(inhrelid, lockmode);
@@ -133,10 +119,22 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
 			}
 		}
 
-		list = lappend_oid(list, inhrelid);
+		oidarr[numoids++] = inhrelid;
 	}
 
-	pfree(oidarr);
+	/*
+	 * Do not break regression tests - return the oids in the same order as
+	 * would have been previously.
+	 */
+	if (numoids > 1)
+		qsort(oidarr, numoids, sizeof(Oid), oid_cmp);
+
+	for (i = 0; i < numoids; i++)
+		list = lappend_oid(list, oidarr[i]);
+
+	systable_endscan_ordered(scan);
+	index_close(index, AccessShareLock);
+	heap_close(relation, AccessShareLock);
 
 	return list;
 }