Isn't partition drop code seriously at risk of deadlock?
The complaint in bug #14927 that heap_drop_with_catalog is not bothering
to check for SearchSysCache lookup failure (in code evidently newly added
for the partition feature) seems to me to be only scratching the surface
of what's wrong with that code. In particular, I do not understand how
it can possibly be deadlock-free to be trying to grab AccessExclusiveLock
on a partition's parent table when we already have such a lock on the
partition. Which we do, or at least had better, long before we get to
heap_drop_with_catalog.
regards, tom lane
On 2017/11/28 9:04, Tom Lane wrote:
> The complaint in bug #14927 that heap_drop_with_catalog is not bothering
> to check for SearchSysCache lookup failure (in code evidently newly added
> for the partition feature) seems to me to be only scratching the surface
> of what's wrong with that code. In particular, I do not understand how
> it can possibly be deadlock-free to be trying to grab AccessExclusiveLock
> on a partition's parent table when we already have such a lock on the
> partition. Which we do, or at least had better, long before we get to
> heap_drop_with_catalog.
We do that as of 258cef12540fa1 [1]. The lock on the parent is taken in
RangeVarCallbackForDropRelation() before the partition itself is locked.
Thanks,
Amit
[1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=258cef12540fa1
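
For readers following the thread, the lock-ordering point can be illustrated with a small sketch. This is not PostgreSQL code: the function name, the lock names, and the "other session" that locks the parent before the partition are all assumptions made for illustration. It only checks whether two sessions take the same pair of locks in opposite orders, which is the precondition for the deadlock Tom describes.

```python
from itertools import combinations


def has_lock_order_inversion(order_a, order_b):
    """Return True if two sessions take some pair of locks in opposite orders.

    An inversion is a necessary ingredient for a two-session deadlock on
    exclusive locks; this is a conservative check, not a lock manager.
    """
    position_in_b = {name: i for i, name in enumerate(order_b)}
    shared = [name for name in order_a if name in position_in_b]
    # combinations() yields (x, y) with x acquired before y by session A.
    return any(position_in_b[x] > position_in_b[y]
               for x, y in combinations(shared, 2))


# Hypothetical acquisition orders, named for illustration only.
drop_child_first = ["partition", "parent"]   # the order Tom is worried about
drop_parent_first = ["parent", "partition"]  # the order Amit describes
other_session = ["parent", "partition"]      # assumed: some parent-first session

print(has_lock_order_inversion(drop_child_first, other_session))   # True
print(has_lock_order_inversion(drop_parent_first, other_session))  # False
```

With the parent locked first, the drop path acquires locks in the same order as the assumed parent-first session, so this particular inversion cannot arise.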