Small patch to fix an O(N^2) problem in foreign keys
Attached are a small patch and a script to reproduce the issue.
The problem is a cache introduced in commit 45ba4247 that improves
foreign key lookups during bulk updates when the FK value does not
change. When restoring a schema dump from a database with many (say
100,000) foreign keys, this cache grows very large and every ALTER
TABLE command triggers an InvalidateConstraintCacheCallBack(), which
does a sequential hash table scan.
The patch uses a heuristic to detect when the hash table
should be destroyed and recreated. InvalidateConstraintCacheCallBack()
adds the current size of the hash table to a counter. When that sum
reaches 1,000,000, the hash table is flushed. This improves the schema
restore of a database with 100,000 foreign keys by a factor of 3.
According to my tests, the patch does not interfere with the bulk
updates that the original feature was supposed to improve.
Regards, Jan
--
Jan Wieck
Senior Software Engineer
http://slony.info
Attachment: pg96_ri_constraint_cache_flush.patch (text/x-patch)
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 61edde9..d7023ce 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -183,6 +183,7 @@ typedef struct RI_CompareHashEntry
* ----------
*/
static HTAB *ri_constraint_cache = NULL;
+static long ri_constraint_cache_seq_count = 0;
static HTAB *ri_query_cache = NULL;
static HTAB *ri_compare_cache = NULL;
@@ -2945,6 +2946,27 @@ InvalidateConstraintCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
Assert(ri_constraint_cache != NULL);
+ /*
+ * Prevent an O(N^2) problem when creating large numbers of foreign
+ * key constraints with ALTER TABLE, as happens at the end of restoring
+ * a pg_dump with hundreds of thousands of tables having references.
+ */
+ ri_constraint_cache_seq_count += hash_get_num_entries(ri_constraint_cache);
+ if (ri_constraint_cache_seq_count > 1000000)
+ {
+ HASHCTL ctl;
+
+ hash_destroy(ri_constraint_cache);
+ memset(&ctl, 0, sizeof(ctl));
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(RI_ConstraintInfo);
+ ri_constraint_cache = hash_create("RI constraint cache",
+ RI_INIT_CONSTRAINTHASHSIZE,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+ ri_constraint_cache_seq_count = 0;
+ return;
+ }
+
hash_seq_init(&status, ri_constraint_cache);
while ((hentry = (RI_ConstraintInfo *) hash_seq_search(&status)) != NULL)
{
Jan Wieck <jan@wi3ck.info> wrote:

> The problem is a cache introduced in commit 45ba4247 that improves

That's a bit off; 45ba424f seems to be what you mean.

> foreign key lookups during bulk updates when the FK value does not
> change. When restoring a schema dump from a database with many (say
> 100,000) foreign keys, this cache grows very large and every ALTER
> TABLE command triggers an InvalidateConstraintCacheCallBack(), which
> does a sequential hash table scan.
>
> The patch uses a heuristic to detect when the hash table
> should be destroyed and recreated. InvalidateConstraintCacheCallBack()
> adds the current size of the hash table to a counter. When that sum
> reaches 1,000,000, the hash table is flushed. This improves the schema
> restore of a database with 100,000 foreign keys by a factor of 3.
>
> According to my tests, the patch does not interfere with the bulk
> updates that the original feature was supposed to improve.
In the real-world customer case that caused you to look into this,
I thought 45ba424f drove schema-only restore time from 2 hours to
about 25 hours, and that this patch takes it back down to 2 hours.
Am I remembering right? And this came about because it added
20-some hours to a pg_upgrade run?
If there are no objections, I will push this as a bug fix back to
9.3, where the performance regression was introduced.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 09/08/2015 08:49 AM, Kevin Grittner wrote:

> Jan Wieck <jan@wi3ck.info> wrote:
>
>> The problem is a cache introduced in commit 45ba4247 that improves
>
> That's a bit off; 45ba424f seems to be what you mean.

Copy paste over paper.

>> foreign key lookups during bulk updates when the FK value does not
>> change. When restoring a schema dump from a database with many (say
>> 100,000) foreign keys, this cache grows very large and every ALTER
>> TABLE command triggers an InvalidateConstraintCacheCallBack(), which
>> does a sequential hash table scan.
>>
>> The patch uses a heuristic to detect when the hash table
>> should be destroyed and recreated. InvalidateConstraintCacheCallBack()
>> adds the current size of the hash table to a counter. When that sum
>> reaches 1,000,000, the hash table is flushed. This improves the schema
>> restore of a database with 100,000 foreign keys by a factor of 3.
>>
>> According to my tests, the patch does not interfere with the bulk
>> updates that the original feature was supposed to improve.
>
> In the real-world customer case that caused you to look into this,
> I thought 45ba424f drove schema-only restore time from 2 hours to
> about 25 hours, and that this patch takes it back down to 2 hours.
> Am I remembering right? And this came about because it added
> 20-some hours to a pg_upgrade run?

From 2 hours to >50, but yes, this is that case.

> If there are no objections, I will push this as a bug fix back to
> 9.3, where the performance regression was introduced.
Regards, Jan
--
Jan Wieck
Senior Software Engineer
http://slony.info
Jan Wieck <jan@wi3ck.info> wrote:

>>> The patch uses a heuristic to detect when the hash table
>>> should be destroyed and recreated. InvalidateConstraintCacheCallBack()
>>> adds the current size of the hash table to a counter. When that sum
>>> reaches 1,000,000, the hash table is flushed. This improves the schema
>>> restore of a database with 100,000 foreign keys by a factor of 3.
>>>
>>> According to my tests, the patch does not interfere with the bulk
>>> updates that the original feature was supposed to improve.
>>
>> In the real-world customer case that caused you to look into this,
>> I thought 45ba424f drove schema-only restore time from 2 hours to
>> about 25 hours, and that this patch takes it back down to 2 hours.
>> Am I remembering right? And this came about because it added
>> 20-some hours to a pg_upgrade run?
>
> From 2 hours to >50, but yes, this is that case.

>> If there are no objections, I will push this as a bug fix back to
>> 9.3, where the performance regression was introduced.
Hearing no objections, pushed with one minor tweak Jan and I
discussed off-list -- the original patch duplicated a bit of code
that we agreed should be split into a static function and called
from both places, to protect against someone later changing one and
missing the other.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company