Speed up collation cache
The blog post here (thank you depesz!):
showed an interesting result where the builtin provider is not quite as
fast as "C" for queries like:
SELECT * FROM a WHERE t = '...';
The reason is that it's calling varstr_cmp() many times, which does a
lookup in the collation cache for each call. For sorts, it only does a
lookup in the collation cache once, so the effect is not significant.
The reason looking up "C" is faster is because there's a special check
for C_COLLATION_OID, so it doesn't even need to do the hash lookup. If
you create an equivalent collation like:
CREATE COLLATION libc_c(PROVIDER = libc, LOCALE = 'C');
it will perform the same as a collation with the builtin provider.
Attached is a patch to use simplehash.h instead, which speeds things up
enough to make them fairly close (from around 15% slower to around 8%).
The patch is based on the series here:
/messages/by-id/f1935bc481438c9d86c2e0ac537b1c110d41a00a.camel@j-davis.com
which does some refactoring in a related area, but I can make them
independent.
We can also consider what to do about those special cases:
* add a special case for PG_C_UTF8?
* instead of a hardwired set of special collation IDs, have a single-
element "last collation ID" to check before doing the hash lookup?
* remove the special cases entirely if we can close the performance
gap enough that it's not important?
(Note: the special case in lc_ctpye_is_c() is currently required for
correctness because hba.c uses C_COLLATION_OID for regexes before the
syscache is initialized. That can be fixed pretty easily a couple
different ways, though.)
--
Jeff Davis
PostgreSQL Contributor Team - AWS
Attachments:
v2-0007-Change-collation-cache-to-use-simplehash.h.patchtext/x-patch; charset=UTF-8; name=v2-0007-Change-collation-cache-to-use-simplehash.h.patchDownload
From 777186a41955da8d05929f5c34e531dfa985b513 Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Fri, 14 Jun 2024 15:38:42 -0700
Subject: [PATCH v2 7/7] Change collation cache to use simplehash.h.
---
src/backend/utils/adt/pg_locale.c | 39 +++++++++++++++++++++----------
1 file changed, 27 insertions(+), 12 deletions(-)
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 435a37a0e3..b71ca2d780 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -57,12 +57,12 @@
#include "access/htup_details.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_database.h"
+#include "common/hashfn.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "utils/builtins.h"
#include "utils/formatting.h"
#include "utils/guc_hooks.h"
-#include "utils/hsearch.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/pg_locale.h"
@@ -129,10 +129,27 @@ typedef struct
{
Oid collid; /* hash key: pg_collation OID */
pg_locale_t locale; /* locale_t struct, or 0 if not valid */
-} collation_cache_entry;
-static HTAB *collation_cache = NULL;
+ /* needed for simplehash */
+ uint32 hash;
+ char status;
+} collation_cache_entry;
+#define SH_PREFIX collation_cache
+#define SH_ELEMENT_TYPE collation_cache_entry
+#define SH_KEY_TYPE Oid
+#define SH_KEY collid
+#define SH_HASH_KEY(tb, key) hash_uint32((uint32) key)
+#define SH_EQUAL(tb, a, b) (a == b)
+#define SH_GET_HASH(tb, a) a->hash
+#define SH_SCOPE static inline
+#define SH_STORE_HASH
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+static MemoryContext CollationCacheContext = NULL;
+static collation_cache_hash *CollationCache = NULL;
#if defined(WIN32) && defined(LC_MESSAGES)
static char *IsoLocaleName(const char *);
@@ -1235,18 +1252,16 @@ lookup_collation_cache(Oid collation)
Assert(OidIsValid(collation));
Assert(collation != DEFAULT_COLLATION_OID);
- if (collation_cache == NULL)
+ if (CollationCache == NULL)
{
- /* First time through, initialize the hash table */
- HASHCTL ctl;
-
- ctl.keysize = sizeof(Oid);
- ctl.entrysize = sizeof(collation_cache_entry);
- collation_cache = hash_create("Collation cache", 100, &ctl,
- HASH_ELEM | HASH_BLOBS);
+ CollationCacheContext = AllocSetContextCreate(TopMemoryContext,
+ "collation cache",
+ ALLOCSET_DEFAULT_SIZES);
+ CollationCache = collation_cache_create(
+ CollationCacheContext, 128, NULL);
}
- cache_entry = hash_search(collation_cache, &collation, HASH_ENTER, &found);
+ cache_entry = collation_cache_insert(CollationCache, collation, &found);
if (!found)
{
/*
--
2.34.1
On 15.06.24 01:46, Jeff Davis wrote:
* instead of a hardwired set of special collation IDs, have a single-
element "last collation ID" to check before doing the hash lookup?
I'd imagine that method could be very effective.
On Sat, Jun 15, 2024 at 6:46 AM Jeff Davis <pgsql@j-davis.com> wrote:
Attached is a patch to use simplehash.h instead, which speeds things up
enough to make them fairly close (from around 15% slower to around 8%).
+#define SH_HASH_KEY(tb, key) hash_uint32((uint32) key)
For a static inline hash for speed reasons, we can use murmurhash32
here, which is also inline.
On Thu, 2024-06-20 at 17:07 +0700, John Naylor wrote:
On Sat, Jun 15, 2024 at 6:46 AM Jeff Davis <pgsql@j-davis.com> wrote:
Attached is a patch to use simplehash.h instead, which speeds
things up
enough to make them fairly close (from around 15% slower to around
8%).+#define SH_HASH_KEY(tb, key) hash_uint32((uint32) key)
For a static inline hash for speed reasons, we can use murmurhash32
here, which is also inline.
Thank you, that brings it down a few more percentage points.
New patches attached, still based on the setlocale-removal patch
series.
Setup:
create collation libc_c (provider=libc, locale='C');
create table collation_cache_test(t text);
insert into collation_cache_test
select g::text||' '||g::text
from generate_series(1,200000000) g;
Queries:
select * from collation_cache_test where t < '0' collate "C";
select * from collation_cache_test where t < '0' collate libc_c;
The two collations are identical except that the former benefits from
the optimization for C_COLLATION_OID, and the latter does not, so these
queries measure the overhead of the collation cache lookup.
Results (in ms):
"C" "libc_c" overhead
master: 6350 7855 24%
v4-0001: 6091 6324 4%
(Note: I don't have an explanation for the difference in performance of
the "C" locale -- probably just some noise in the test.)
Considering that simplehash brings the worst case overhead under 5%, I
don't see a big reason to use the single-element cache also.
Regards,
Jeff Davis
Attachments:
v4-0006-Change-collation-cache-to-use-simplehash.h.patchtext/x-patch; charset=UTF-8; name=v4-0006-Change-collation-cache-to-use-simplehash.h.patchDownload
From 64a017f169858cf646002a28f97ae05cb7ab9fcd Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Fri, 14 Jun 2024 15:38:42 -0700
Subject: [PATCH v4] Change collation cache to use simplehash.h.
---
src/backend/utils/adt/pg_locale.c | 39 +++++++++++++++++++++----------
1 file changed, 27 insertions(+), 12 deletions(-)
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 2e6f624798f..5afb69c6632 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -57,12 +57,12 @@
#include "access/htup_details.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_database.h"
+#include "common/hashfn.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "utils/builtins.h"
#include "utils/formatting.h"
#include "utils/guc_hooks.h"
-#include "utils/hsearch.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/pg_locale.h"
@@ -129,10 +129,27 @@ typedef struct
{
Oid collid; /* hash key: pg_collation OID */
pg_locale_t locale; /* locale_t struct, or 0 if not valid */
-} collation_cache_entry;
-static HTAB *collation_cache = NULL;
+ /* needed for simplehash */
+ uint32 hash;
+ char status;
+} collation_cache_entry;
+#define SH_PREFIX collation_cache
+#define SH_ELEMENT_TYPE collation_cache_entry
+#define SH_KEY_TYPE Oid
+#define SH_KEY collid
+#define SH_HASH_KEY(tb, key) murmurhash32((uint32) key)
+#define SH_EQUAL(tb, a, b) (a == b)
+#define SH_GET_HASH(tb, a) a->hash
+#define SH_SCOPE static inline
+#define SH_STORE_HASH
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+static MemoryContext CollationCacheContext = NULL;
+static collation_cache_hash *CollationCache = NULL;
#if defined(WIN32) && defined(LC_MESSAGES)
static char *IsoLocaleName(const char *);
@@ -1219,18 +1236,16 @@ lookup_collation_cache(Oid collation)
Assert(OidIsValid(collation));
Assert(collation != DEFAULT_COLLATION_OID);
- if (collation_cache == NULL)
+ if (CollationCache == NULL)
{
- /* First time through, initialize the hash table */
- HASHCTL ctl;
-
- ctl.keysize = sizeof(Oid);
- ctl.entrysize = sizeof(collation_cache_entry);
- collation_cache = hash_create("Collation cache", 100, &ctl,
- HASH_ELEM | HASH_BLOBS);
+ CollationCacheContext = AllocSetContextCreate(TopMemoryContext,
+ "collation cache",
+ ALLOCSET_DEFAULT_SIZES);
+ CollationCache = collation_cache_create(
+ CollationCacheContext, 16, NULL);
}
- cache_entry = hash_search(collation_cache, &collation, HASH_ENTER, &found);
+ cache_entry = collation_cache_insert(CollationCache, collation, &found);
if (!found)
{
/*
--
2.34.1
On 7/26/24 11:00 PM, Jeff Davis wrote:
Results (in ms):
"C" "libc_c" overhead
master: 6350 7855 24%
v4-0001: 6091 6324 4%
I got more overhead in my quick benchmarking when I ran the same
benchmark. Also tried your idea with caching the last lookup (PoC patch
attached) and it basically removed all overhead, but I guess it will not
help if you have two different non.default locales in the same query.
"C" "libc_c" overhead
before: 6695 8376 25%
after: 6605 7340 11%
cache last: 6618 6677 1%
But even without that extra optimization I think this patch is worth
merging and the patch is small, simple and clean and easy to understand
and a just a clear speed up. Feels like a no brainer. I think that it is
ready for committer.
And then we can discuss after committing if an additional cache of the
last locale is worth it or not.
Andreas
Attachments:
0001-WIP-Ugly-caching-of-last-locale.patchtext/x-patch; charset=UTF-8; name=0001-WIP-Ugly-caching-of-last-locale.patchDownload
From 5f634670569a3ef8249ff1747af2157b6939f505 Mon Sep 17 00:00:00 2001
From: Andreas Karlsson <andreas@proxel.se>
Date: Sun, 28 Jul 2024 00:04:43 +0200
Subject: [PATCH] WIP: Ugly caching of last locale
---
src/backend/utils/adt/pg_locale.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 4628fcd8dd..e0de7aa625 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -1566,6 +1566,9 @@ init_database_collation(void)
ReleaseSysCache(tup);
}
+static Oid last_collid = InvalidOid;
+static pg_locale_t last_locale = NULL;
+
/*
* Create a locale_t from a collation OID. Results are cached for the
* lifetime of the backend. Thus, do not free the result with freelocale().
@@ -1587,6 +1590,9 @@ pg_newlocale_from_collation(Oid collid)
if (collid == DEFAULT_COLLATION_OID)
return &default_locale;
+ if (collid == last_collid)
+ return last_locale;
+
cache_entry = lookup_collation_cache(collid);
if (cache_entry->locale == 0)
@@ -1712,6 +1718,9 @@ pg_newlocale_from_collation(Oid collid)
cache_entry->locale = resultp;
}
+ last_collid = collid;
+ last_locale = cache_entry->locale;
+
return cache_entry->locale;
}
--
2.43.0
On Sun, 2024-07-28 at 00:14 +0200, Andreas Karlsson wrote:
But even without that extra optimization I think this patch is worth
merging and the patch is small, simple and clean and easy to
understand
and a just a clear speed up. Feels like a no brainer. I think that it
is
ready for committer.
Committed, thank you.
And then we can discuss after committing if an additional cache of
the
last locale is worth it or not.
Yeah, I'm holding off on that until refactoring in the area settles,
and we'll see if it's still worth it.
Regards,
Jeff Davis